POLYNOMIAL DYNAMICS 



ALICE MEDVEDEV AND THOMAS SCANLON 

Abstract. We study algebraic dynamical systems (and, more generally,
$\sigma$-varieties) $\Phi : \mathbb{A}^n_{\mathbb{C}} \to \mathbb{A}^n_{\mathbb{C}}$ given by coordinatewise univariate polynomials,
$\Phi(x_1, \ldots, x_n) = (f_1(x_1), \ldots, f_n(x_n))$, by refining an old theorem of Ritt on
compositional identities amongst polynomials. Our main result is an explicit
description of the skew-invariant varieties, that is, those algebraic varieties
$X \subseteq \mathbb{A}^n_{\mathbb{C}}$ for which there is a field automorphism $\sigma : \mathbb{C} \to \mathbb{C}$ with $\Phi(X) = X^\sigma$.
As consequences, we deduce a variant of a conjecture of Zhang on the existence
of rational points with Zariski dense forward orbits and a strong form of the
dynamical Manin-Mumford conjecture for liftings of the Frobenius.

We also show that in models of ACFA$_0$, a trivial set defined by $\sigma(x) = f(x)$
for $f$ a polynomial has Morley rank 1 and is usually strongly minimal, that
the induced structure on this set is $\aleph_0$-categorical unless $f$ is defined over a
fixed field of a power of $\sigma$, and that nonorthogonality between two such sets
is definable in families if $f$ is defined over a fixed field of a power of $\sigma$.



1. Introduction 

Let $f_1, \ldots, f_n \in \mathbb{C}[x]$ be a finite sequence of polynomials over the complex
numbers and let $\Phi : \mathbb{A}^n_{\mathbb{C}} \to \mathbb{A}^n_{\mathbb{C}}$ be the map $(x_1, \ldots, x_n) \mapsto (f_1(x_1), \ldots, f_n(x_n))$ given
by applying the polynomials coordinatewise. We aim to explicitly describe those
algebraic varieties $X \subseteq \mathbb{A}^n_{\mathbb{C}}$ which are invariant under $\Phi$. To do so, we solve a more
general problem. We fix a field automorphism $\sigma : \mathbb{C} \to \mathbb{C}$ and then describe those
algebraic varieties $X \subseteq \mathbb{A}^n_{\mathbb{C}}$ which are skew-invariant in the sense that $\Phi(X) \subseteq X^\sigma$,
and recover the solution to the initial problem by taking $\sigma$ to be the identity map.

We consider this more general problem of classifying the skew invariant varieties 
in order to import some techniques from the model theory of difference fields and 
because we are motivated by some fine structural problems in the model theory 
of difference fields. Recall that a difference field is a field $K$ equipped with a
distinguished endomorphism $\sigma : K \to K$. The theory of difference fields, expressed
in the first-order language of rings expanded by a unary function symbol for the 
endomorphism, admits a model companion, ACFA, the models of which we call 
difference closed, and it is the rich structure theory of the definable sets in difference 
closed fields developed in [B] which we shall employ. 

In [12] the first author refined the trichotomy theorems of [6], [8] to show that
sets defined by formulas of the form $\sigma(x) = f(x)$ where $f$ is a rational function are
trivial unless $f$ is covered by an isogeny of algebraic groups in the sense that there
is a one-dimensional algebraic group $G$, an isogeny $\phi : G \to G^\sigma$, and a rational
function $\pi : G \to \mathbb{P}^1$ with $f \circ \pi = \pi^\sigma \circ \phi$. In this context, triviality is a very
strong property and, essentially, all algebraic relations amongst solutions to trivial
equations are reducible to binary relations. It is this consequence and the fact that
the dynamical systems arising from isogenies are well-understood that we shall use
to reduce the problem of describing general $\Phi$-skew invariant varieties to that of
describing skew invariant curves in the affine plane.

Thus, the bulk of the technical work in this paper concerns the problem of
describing those affine plane curves $C \subseteq \mathbb{A}^2_{\mathbb{C}}$ which are $(f, g)$-skew invariant where $f$
and $g$ are trivial polynomials in the sense of the previous paragraph. Via some easy
reductions one sees that this problem is really the same as that of describing those
polynomials $h$ for which there are polynomials $\pi$ and $\rho$ satisfying $f \circ \pi = \pi^\sigma \circ h$ and
$g \circ \rho = \rho^\sigma \circ h$. Possible compositional identities involving polynomials over $\mathbb{C}$ were
explicitly classified by Ritt in [17]. Ritt's work has been given a conceptually
cleaner presentation and has been refined to give a very sharp answer to the question
of which polynomials $a(x), b(x), c(x), d(x) \in \mathbb{C}[x]$ satisfy $a \circ b = c \circ d$ by Müller
and Zieve in [13]. Here, we perform a combinatorial analysis of Ritt's theorem to
explicitly describe the possible $h$, $\pi$, and $\rho$ in terms of decompositions of $f$ and $g$
as compositional products.

We find that there are three basic sources for skew-invariant curves. The easiest
to see are those coming from (skew) iteration. If $f$ is any polynomial and $g = f^{\sigma^n}$,
then the graph of $f^{\Diamond n} := f^{\sigma^{n-1}} \circ \cdots \circ f^{\sigma} \circ f$ is $(f, g)$-skew-invariant. In particular,
when $f = f^\sigma$ is fixed by $\sigma$, the graphs of iterates of $f$ (and their converse relations)
are $(f, f)$-invariant. When $f$ is expressible as a nontrivial compositional product,
$f = g \circ h$, then considering what we call a plain skew-twist of $f$, $\tilde{f} := h^\sigma \circ g$, we
see that the graph of $h$ is $(f, \tilde{f})$-skew invariant. In most cases, by computing one
expression of $f$ as a composition of indecomposable polynomials, one may explicitly
and quickly describe all of the possible plain skew twists. Finally, it can happen
that graphs of monomial identities (or their conjugates via some linear change
of variables) may be $(f, g)$-invariant. For example, if $f(x) = x \cdot (1 + x^3)^2$ and
$g(y) = y \cdot (1 + y^2)^3$, then the curve defined by $y^2 = x^3$ is $(f, g)$-invariant. Our
primary task is to prove a precise version of the assertion that these examples
exhaust the possibilities for skew-invariant curves.
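
The monomial example above can be checked mechanically. The following is a minimal sketch in Python (it is not part of the paper), assuming polynomials are stored as lists of integer coefficients in ascending degree; the helper names `poly_mul`, `poly_pow` and `compose` are hypothetical. It uses the parametrization $t \mapsto (t^2, t^3)$ of the curve $y^2 = x^3$ and verifies that the image point $(f(t^2), g(t^3))$ again satisfies $y^2 = x^3$.

```python
# Polynomials are ascending coefficient lists: [c0, c1, c2, ...].

def poly_mul(p, q):
    """Product of two polynomials."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_pow(p, n):
    """n-th power of a polynomial (ordinary product, not composition)."""
    out = [1]
    for _ in range(n):
        out = poly_mul(out, p)
    return out

def compose(p, q):
    """Return p(q(x)) using Horner's rule, with trailing zeros removed."""
    out = [0]
    for a in reversed(p):
        out = poly_mul(out, q)
        out[0] += a
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out

# f(x) = x (1 + x^3)^2 and g(y) = y (1 + y^2)^3, as in the text.
f = poly_mul([0, 1], poly_pow([1, 0, 0, 1], 2))
g = poly_mul([0, 1], poly_pow([1, 0, 1], 3))

# Points of y^2 = x^3 are (t^2, t^3); their images are (f(t^2), g(t^3)).
x_image = compose(f, [0, 0, 1])      # f(t^2)
y_image = compose(g, [0, 0, 0, 1])   # g(t^3)

lhs = poly_pow(y_image, 2)           # g(t^3)^2
rhs = poly_pow(x_image, 3)           # f(t^2)^3
assert lhs == rhs                    # the image again lies on y^2 = x^3
```

Both sides equal $t^6 (1 + t^6)^6$, so the check only confirms that the curve is mapped into itself by $(f, g)$; it says nothing about the classification of all invariant curves.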

We apply our results on skew invariant varieties to address problems of two
different characters. First, we use them to prove variants of two conjectures of
Zhang [21] on the arithmetic of dynamical systems. In another direction, we use
our results to resolve the question of definability of nonorthogonality between types
containing a formula of the form $\sigma(x) = f(x)$ for some polynomial $f$ in ACFA$_0$.

In the case of the diophantine questions, we show that if $K$ is a finitely generated
subfield of $\mathbb{C}$ and $\Phi : \mathbb{A}^n_K \to \mathbb{A}^n_K$ is given by a sequence of univariate polynomials
each of degree at least two, then there are points $a \in \mathbb{A}^n(K)$ with a Zariski dense
$\Phi$-forward orbit. In fact, we prove a somewhat stronger result in which some of the $f_i$
are allowed to be linear. In another direction we prove a refined version of Zhang's
Manin-Mumford conjecture for dynamical systems lifting a Frobenius. Again, the
precise statement will have to wait, but we can note a special case. Suppose that $q$
is a power of a prime $p$ and that $f(x) \in \mathbb{Z}[x]$ is a polynomial of degree $q$ for which
$f(x) \equiv x^q \pmod{p}$, but $f$ is not linearly conjugate to either $x^q$ itself or the $q$-th
Chebyshev polynomial and $f$ is not a compositional power. Then any irreducible
variety $X \subseteq \mathbb{A}^n_{\mathbb{C}}$ containing a Zariski dense set of $n$-tuples of $f$-periodic points must
be defined by finitely many equations of the form $x_i = \zeta$ for some $f$-periodic point
$\zeta$ and $x_j = f^{\circ m}(x_k)$.

In the case of differential fields, Hrushovski and Itai showed that there are
theories of model complete differential fields other than the theory of differentially closed
fields [10]. It is still open whether or not there are model complete theories of
difference fields other than ACFA, but if one could show that there were some formula
$\theta(x)$ defining in a difference closed field a set of D-rank 1 having only finitely many
algebraic realizations such that for every other formula $\eta(y, z)$ the set of parameters
$\{b : \text{some nonalgebraic type } p(x) \text{ extending } \theta(x) \text{ is nonorthogonal to } \eta(y, b)\}$
is definable, then one could produce a new model complete difference field by
omitting the nonalgebraic types in $\theta$. We do not achieve this goal, but we do show
that in characteristic zero for $\theta(x)$ given by $\sigma(x) = f(x)$ where $f$ is a polynomial
defined over the fixed field, for any $\eta(y, z)$ of the same form (i.e., $\eta(y, z)$ is defined
by $\sigma(y) = g(y, z)$ where $g$ is a polynomial), the set of parameters $b$ for which
$\theta(x)$ and $\eta(x, b)$ are nonorthogonal is definable. In fact, we show that even in
the cases where nonorthogonality is not definable, the only real obstruction comes
from graphs of the distinguished automorphism. A byproduct of this analysis is an
explicit characterization of the algebraic closure operator on trivial D-rank 1 sets
defined by $\sigma(x) = f(x)$, and the observation that this set is strongly minimal unless
$f(x)$ is skew-conjugate to $x^k \cdot u(x)^n$ for some polynomial $u$ and some $n > 1$, and in
any case has Morley rank 1 if $u$ is non-constant.

This paper is organized as follows. In Section 2 we lay out our conventions
and notation. We begin with our technical work on polynomial compositional
identities in Section 3. In Section 4 we completely describe the possible
skew-invariant varieties as a consequence of theorems on the model theory of difference
fields and the results of Section 3. In Section 5 we conclude with three applications
of our results to definability of orthogonality, Zhang's conjecture on the density of
dynamical orbits, and a version of the dynamical Manin-Mumford conjecture for
Frobenius lifts.

We thank M. Zieve for sharing a preliminary version of [13] and for discussing
issues around compositional identities of polynomials and rational functions.

2. Notation and conventions 
For the most part, our notation is standard. 

Unless explicitly stated to the contrary, we work over an algebraically closed 
field L of characteristic zero which the reader may take to be C without loss of 
generality. Thus, for example, when we say "polynomial" without qualification we 
mean "polynomial with coefficients from L." 

With the exception of the results in Section 5.3, for which the language of schemes
is necessary, when we speak of algebraic varieties we really mean closed
subvarieties of affine space. As such, the reader is welcome to read for "variety" the
phrase "subset of some $\mathbb{A}^n(K) = K^n$ defined by the vanishing of finitely many
polynomials." Note that for us a variety need not be irreducible. We write $\mathbb{G}_a$ for
the additive group considered as an algebraic group and $\mathbb{G}_m$ for the multiplicative
group considered as an algebraic group.

Recall that a difference field is a field $K$ given together with a distinguished field
endomorphism $\sigma : K \to K$. On occasion, we shall endow $L$ with a distinguished
endomorphism $\sigma : L \to L$ making $L$ into a difference field. We shall explain some
results from the model theory of difference fields in Section 4. By the fixed field we
shall mean $\{x \in L : \sigma(x) = x\}$.

If $X$ is an algebraic variety over $L$, then $X^\sigma$ is the $\sigma$-transform of $X$. Formally,
$X^\sigma$ is the base change of $X$ from $\operatorname{Spec}(L)$ to $\operatorname{Spec}(L)$ via $\sigma^*$. Less categorically, if $X$
is given as a closed subvariety of some affine space defined by the vanishing of some
polynomials, then $X^\sigma$ is defined by the vanishing of the polynomials obtained by
applying $\sigma$ to the coefficients of the polynomials defining $X$. In terms of rational
points, if $X \subseteq \mathbb{A}^n_L$, then $X^\sigma(L) = \{(\sigma(a_1), \ldots, \sigma(a_n)) : (a_1, \ldots, a_n) \in X(L)\}$.
Likewise, if $f : X \to Y$ is a morphism of varieties, then one obtains a morphism
$f^\sigma : X^\sigma \to Y^\sigma$.

Following Pink and Rössler [16], a $\sigma$-variety is a pair $(X, f)$ where $X$ is an
algebraic variety and $f : X \to X^\sigma$ is a morphism from $X$ to the $\sigma$-transform of $X$.
Some authors require $f$ to be dominant and in practice we are really only interested
in this case. When $X^\sigma = X$ and $f^\sigma = f$, so that $f : X \to X$, we call this $\sigma$-variety
an algebraic dynamical system or an AD for short. A morphism $\psi : (X, f) \to (Y, g)$
of $\sigma$-varieties is given by a morphism of varieties $\psi : X \to Y$ for which the following
diagram commutes

$$
\begin{array}{ccc}
X & \xrightarrow{\ f\ } & X^\sigma \\
{\scriptstyle \psi}\downarrow & & \downarrow{\scriptstyle \psi^\sigma} \\
Y & \xrightarrow{\ g\ } & Y^\sigma
\end{array}
$$

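A morphism $\psi : (X, f) \to (Y, g)$ of ADs is also given by a morphism of varieties
$\psi : X \to Y$ for which a related diagram commutes

$$
\begin{array}{ccc}
X & \xrightarrow{\ f\ } & X \\
{\scriptstyle \psi}\downarrow & & \downarrow{\scriptstyle \psi} \\
Y & \xrightarrow{\ g\ } & Y
\end{array}
$$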



When all of the objects in question are defined over the fixed field, a morphism
of ADs is the same as a morphism of $\sigma$-varieties, but in general, these notions are
different and it may happen that two ADs are isomorphic as $\sigma$-varieties but not as
ADs.

Given an AD $(X, f)$ we define $(X, f^{\circ n})$ by recursion with $f^{\circ 0} := \mathrm{id}_X$ and
$f^{\circ (n+1)} := f \circ f^{\circ n}$. If $(X, f)$ is a $\sigma$-variety, then we define $f^{\Diamond n} : X \to X^{\sigma^n}$ by
recursion with $f^{\Diamond 0} := \mathrm{id}_X$ and $f^{\Diamond (n+1)} := f^{\sigma^n} \circ f^{\Diamond n}$. If we need to discuss
Cartesian powers we might use the notation $f^{\times n}$. That is, if $f : X \to Y$, we might write
$f^{\times n} : X^{\times n} \to Y^{\times n}$ for the map $(x_1, \ldots, x_n) \mapsto (f(x_1), \ldots, f(x_n))$.

If $(X, \Phi)$ is a $\sigma$-variety, then we say that a subvariety $Y \subseteq X$ is skew-invariant,
or $\Phi$-skew-invariant if we wish to emphasize $\Phi$, if $\Phi$ maps $Y$ dominantly to $Y^\sigma$.
Likewise, if $(X, \Phi)$ is simply an AD, then a subvariety $Y$ of $X$, also defined over
the fixed field of $\sigma$, is $\Phi$-invariant just in case $\Phi(Y) = Y$.

Note that it could happen that a variety is (skew-)invariant even though none
of its irreducible components are. However, they will be (skew-)invariant for $\Phi^{\circ n}$
(respectively, $\Phi^{\Diamond n}$ and $\sigma^n$) for some $n$. In general, one might prefer to study those varieties
for which $\Phi(Y) \subseteq Y^\sigma$ or for which $\Phi|_Y : Y \to Y^\sigma$ is dominant. For the questions
we study here, these distinctions are immaterial.

3. Polynomial compositional identities

In this, the longest section of the paper, we recall Ritt's theorem on compositions
of polynomials in detail, derive our theorems on the canonical sequences of Ritt
swaps, and then use these results on canonical sequences of Ritt swaps to describe
the possible $(f, g)$-skew invariant curves when $f$ and $g$ are polynomials of degree at
least two.

We should say something about attributions before launching into the
technical details. Of course, the key result, Theorem 3.1 already mentioned, is due
to Ritt [17]. Julia and Fatou are usually credited with having founded the study
of identities amongst iterations of rational functions, but Ritt's contemporaneous
work contains stronger results and his theorem on compositional identities amongst
polynomials stands as one of the highest achievements in algebraic dynamics. We
take Ritt's theorem as basic and then deduce via a mostly combinatorial analysis
(mixing in some elementary properties of polynomials) some canonical forms for
recompositions of polynomials. The end result of Section 3.5.2 is an easy
consequence of the main theorem of Müller and Zieve in [13] and the proof there is much
cleaner. The reader might well ask why we have bothered to retain this material.
Our main goal is to describe the skew-invariant varieties for polynomial ADs. To
do so we need to continue the combinatorial analysis to an investigation of
skew-twists (see Section 3.6.1) and the internal structure of our combinatorial proof of
the refinements of Ritt's theorem seems to be necessary. In private communication,
Zieve suggested a method to deduce our final results from the main theorem of [13],
but as these arguments were at least as long as those presented here, we chose to
follow our original approach. It may be the case that Zieve's arguments can be
simplified and also given a geometric presentation, in which case they might adapt
well to rational functions or positive characteristic.

3.1. Ritt's theorem. We begin by recalling the definitions needed for the state- 
ment of Ritt's theorem. 

Definition 3.0.1. A polynomial $L$ is linear if there are $A \neq 0$ and $B$, with $L(x) = Ax + B$.
We say that $L$ is a scaling if $B = 0$ and write $L = (\cdot A)$. We say that $L$ is
a translation if $A = 1$ and write $L = (+B)$. We write $(-C)$ for $(+(-C))$.

We use the word "linear" even though "affine" might be more appropriate. The 
linear polynomials correspond to the automorphisms of the affine line. In algebraic 
terms, the linear polynomials are the elements of the polynomial ring invertible 
under composition. 

Definition 3.0.2. A polynomial $f$ is indecomposable if $\deg(f) \geq 2$ and it cannot
be written as a composition $f = g \circ h$ of two non-linear polynomials $g$ and $h$.

We define decompositions to have indecomposable factors, since that is the only 
notion used in this paper. In a different context, one might wish to speak about 
decompositions some of whose factors are themselves decomposable, and then one 
might wish to call our decompositions complete. 

Definition 3.0.3. For $f \in K[x]$ we say that the finite sequence $(f_k, \ldots, f_1)$ of
polynomials $f_i$ is a decomposition of $f$ if $f = f_k \circ \cdots \circ f_1$ and each $f_i$ is indecomposable.
We often write $\vec{f}$ instead of $(f_k, \ldots, f_1)$.

As $\deg(g \circ h) = \deg(g) \cdot \deg(h) > \deg(g), \deg(h)$ for non-linear $g$ and $h$, the degree
of a polynomial acts as an analogue of a Euclidean norm. Thus, all nonlinear
polynomials have decompositions. (Linear polynomials do not have decompositions.)
Some polynomials admit many decompositions. For example, if $a, b \in \mathbb{Z}$ are distinct
primes and $f(x) = x^{ab}$, then $(x^a, x^b)$ and $(x^b, x^a)$ are two different decompositions
of $f$.

Ritt proved [17] that these different decompositions are in some sense unique up
to permutations. We need a few more definitions before we can state his theorem. 
We will then prove a stronger version of the "up to permutations" part. 

Definition 3.0.4. For $n \in \mathbb{Z}_+$, the $n$-th Chebyshev polynomial is the polynomial
$C_n(x) \in \mathbb{Z}[x]$ defined by the relation

$$x^n + \frac{1}{x^n} = C_n\!\left(x + \frac{1}{x}\right)$$

Other definitions of the Chebyshev polynomials appear in the literature. Note
that with our definition, $C_n$ is monic. We have included the case of $n = 1$ for the
sake of uniformity, but when we speak of Chebyshev polynomials, we will usually
mean $C_n$ with $n \geq 2$. In fact, we will usually only consider $C_p$ with $p$ an odd prime.
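
As an illustration, here is a minimal Python sketch (not from the paper) that computes $C_n$ under this normalization. It assumes the three-term recurrence $C_{m+1}(x) = x \cdot C_m(x) - C_{m-1}(x)$ with $C_0 = 2$ and $C_1 = x$, which follows from multiplying $x^m + x^{-m}$ by $x + x^{-1}$; the value $C_0 = 2$ is used only to start the recurrence. Polynomials are ascending coefficient lists, and the helper names are hypothetical.

```python
def poly_mul(p, q):
    """Product of two polynomials given as ascending coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def compose(p, q):
    """Return p(q(x)) using Horner's rule, with trailing zeros removed."""
    out = [0]
    for a in reversed(p):
        out = poly_mul(out, q)
        out[0] += a
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out

def chebyshev(n):
    """C_n normalized by x^n + x^(-n) = C_n(x + 1/x)."""
    prev, cur = [2], [0, 1]               # C_0 = 2, C_1 = x
    if n == 0:
        return prev
    for _ in range(n - 1):
        nxt = poly_mul([0, 1], cur)       # x * C_m
        for i, a in enumerate(prev):
            nxt[i] -= a                   # ... minus C_{m-1}
        prev, cur = cur, nxt
    return cur

c2, c3, c5 = chebyshev(2), chebyshev(3), chebyshev(5)
assert c2 == [-2, 0, 1]                   # x^2 - 2
assert c3 == [0, -3, 0, 1]                # x^3 - 3x
assert all(chebyshev(n)[-1] == 1 for n in range(1, 12))      # monic
assert compose(c3, c5) == compose(c5, c3) == chebyshev(15)   # C_m o C_n = C_{mn}
```

The last assertion reflects the identity $C_m \circ C_n = C_{mn}$, which is immediate from the defining relation and is the source of the commutation $C_p \circ C_q = C_q \circ C_p$.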

Definition 3.0.5. The following polynomials, when they are indecomposable (and
this is an extra condition only in the last case), are called ritty.

• $Q(x) := x^2$

• $P_p(x) := x^p$, $p$ an odd prime

• $C_p(x)$, $p$ an odd prime

• $S_{k,\ell,n,u}(x) := x^k \cdot u(x^\ell)^n$ where $k \neq 0$, $\gcd(k, \ell) = 1$, $\gcd(k, n) = 1$, $u(0) \neq 0$,
$u$ is monic non-constant, and at least one of $\ell$ and $n$ is greater than one.

Lemma 3.0.1. Any ritty polynomial $a$ of the last kind has a unique maximal $k$, $\ell$,
and $n$, and $u$ is unique up to multiplication by $n$-th roots of 1.

Proof. Of course, the number $k$ is the order of vanishing of $a$ at 0. Let $f(x) := u(x^\ell)^n$
and let $\ell_2$ be maximal such that $f(x) = g(x^{\ell_2})$ for some polynomial $g$. This
is the maximal $\ell_2$ such that the zeros of $f$ are a union of multiplicative cosets of
the group of $\ell_2$-th roots of 1. Let $n_2$ be maximal such that $f(x) = h(x)^{n_2}$ for some
$h$. This $n_2$ is the greatest common divisor of the multiplicities of the roots of $f$. □

Definition 3.0.6. If $f$ is a ritty polynomial of the last kind, we define the in-degree
of $f$ to be the maximal $\ell$, and the out-degree of $f$ to be the maximal $n$ from the
last lemma.

Remark 3.0.1. Occasionally, we write $P_2$ for $Q$, but as the properties of $Q$ are
distinct from those of the monomials of odd degree, we generally treat $Q$ and
$P_p$ for $p$ an odd prime separately. Occasionally, we write $C_2$ for the Chebyshev
polynomial of degree 2, but note that it is not ritty. Though we give a separate
name to the Chebyshev polynomials, it is clear from the definition that they are
odd functions, so for each odd $p$ there is some $u$ such that $C_p(x) = x \cdot u(x^2)$; so
they are in fact a special case of the last kind of ritty polynomials.

Remark 3.0.2. The condition that $u$ be monic is not necessary and does not appear
in the original statement of Ritt's theorem. We will show in Lemma 3.1.1 that
including this hypothesis does not affect the truth of that result.

Definition 3.0.7. The following identities involving ritty polynomials are called
the basic Ritt identities.

• $P_p \circ P_q = P_q \circ P_p$ for $p \neq q$

• $C_p \circ C_q = C_q \circ C_p$ for $p \neq q$

• $P_p \circ S_{k,\ell p,n,u} = S_{k,\ell,pn,u} \circ P_p$

We formally exclude the tautological identities $f \circ f = f \circ f$ from the first two
kinds of basic Ritt identities because they are clearly useless, and because including
them would make the combinatorial results and proofs even more cumbersome. In
particular, Lemma 3.2.6 would not be true as stated.
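
The third basic Ritt identity can be tested on a concrete instance. The sketch below (Python, not from the paper) assumes the reading $S_{k,\ell,n,u}(x) = x^k \cdot u(x^\ell)^n$ and uses the illustrative choice $p = 3$, $k = 1$, $\ell = 2$, $n = 1$, $u(x) = x + 1$; the function names `S` and `P` are hypothetical, and indecomposability of the factors is not checked by the code (here both have prime degree 7).

```python
def poly_mul(p, q):
    """Product of two polynomials given as ascending coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_pow(p, n):
    """n-th power of a polynomial (ordinary product)."""
    out = [1]
    for _ in range(n):
        out = poly_mul(out, p)
    return out

def compose(p, q):
    """Return p(q(x)) using Horner's rule, with trailing zeros removed."""
    out = [0]
    for a in reversed(p):
        out = poly_mul(out, q)
        out[0] += a
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out

def P(p):
    """The monomial x^p."""
    return [0] * p + [1]

def S(k, l, n, u):
    """The polynomial x^k * u(x^l)^n."""
    return poly_mul(P(k), poly_pow(compose(u, P(l)), n))

p, k, l, n = 3, 1, 2, 1
u = [1, 1]                                   # u(x) = x + 1, monic with u(0) != 0

lhs = compose(P(p), S(k, l * p, n, u))       # P_p o S_{k, lp, n, u}
rhs = compose(S(k, l, p * n, u), P(p))       # S_{k, l, pn, u} o P_p
assert lhs == rhs                            # both equal x^3 (x^6 + 1)^3
```

Changing `p`, `k`, `l`, `n` and `u` gives further instances; the identity itself holds for all parameter values, while the side conditions of Definition 3.0.5 are what make the factors ritty.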

Other compositional identities involving indecomposable polynomials can be
obtained from these by carefully inserting linear factors: if $a \circ b = c \circ d$ and $L$, $M$,
and $N$ are linear, then

$$(L^{-1} \circ a \circ M) \circ (M^{-1} \circ b \circ N) = (L^{-1} \circ c) \circ (d \circ N)$$

Before giving a formal definition of this phenomenon, we note that it is even possible 
to obtain new compositional relations between ritty polynomials in this way. 

Remark 3.0.3. From the fact that each $C_p$ commutes with $C_2(x) = x^2 - 2$, it is easy
to see that $\tilde{C}_p := (+2) \circ C_p \circ (-2)$ is of the form $\tilde{C}_p(x) = x \cdot u(x)^2$, so it is ritty;
it is also easy to see that these commute with each other. Another example comes
from using scalings $L$, $M$, and $N$ on the basic Ritt identity $C_p \circ C_q = C_q \circ C_p$.
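
The first claim of this remark can be checked for small $p$. The following Python sketch (not from the paper) assumes the recurrence $C_{m+1} = x \cdot C_m - C_{m-1}$ with $C_0 = 2$, $C_1 = x$ for the Chebyshev polynomials of Definition 3.0.4, and reads the translated polynomial as $x \mapsto C_p(x - 2) + 2$. The explicit factors $u(x) = x - 3$ for $p = 3$ and $u(x) = x^2 - 5x + 5$ for $p = 5$ were found by hand for this illustration; all helper names are hypothetical.

```python
def poly_mul(p, q):
    """Product of two polynomials given as ascending coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def compose(p, q):
    """Return p(q(x)) using Horner's rule, with trailing zeros removed."""
    out = [0]
    for a in reversed(p):
        out = poly_mul(out, q)
        out[0] += a
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out

def chebyshev(n):
    """C_n normalized by x^n + x^(-n) = C_n(x + 1/x), for n >= 1."""
    prev, cur = [2], [0, 1]
    for _ in range(n - 1):
        nxt = poly_mul([0, 1], cur)
        for i, a in enumerate(prev):
            nxt[i] -= a
        prev, cur = cur, nxt
    return cur

def translated(p):
    """(+2) o C_p o (-2), that is, x -> C_p(x - 2) + 2."""
    return compose([2, 1], compose(chebyshev(p), [-2, 1]))

t3, t5 = translated(3), translated(5)

# x * u(x)^2 with u = x - 3, respectively u = x^2 - 5x + 5.
assert t3 == poly_mul([0, 1], poly_mul([-3, 1], [-3, 1]))
assert t5 == poly_mul([0, 1], poly_mul([5, -5, 1], [5, -5, 1]))

# The translated polynomials still commute with each other.
assert compose(t3, t5) == compose(t5, t3)
```

For $p = 3$ this is the identity $(x - 2)^3 - 3(x - 2) + 2 = x (x - 3)^2$.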

Definition 3.0.8. Suppose that $(f_k, \ldots, f_1)$ is a decomposition and that $1 \le i < k$.
If there are linear polynomials $L$, $M$, and $N$ such that
$(L^{-1} \circ f_{i+1} \circ M) \circ (M^{-1} \circ f_i \circ N) = R \circ S$ is a basic Ritt identity, then $(g_k, \ldots, g_1)$ given by $g_i := S \circ N^{-1}$,
$g_{i+1} := L \circ R$, and $g_j := f_j$ for the other $j \le k$ is another decomposition of the same
polynomial and we say that this latter decomposition is obtained from the former
by a Ritt swap at $i$.

In light of this and looking ahead to Ritt's theorem, we define 

Definition 3.0.9. An indecomposable polynomial $f$ is swappable if there are linear
polynomials $L$ and $M$ such that $L \circ f \circ M$ is ritty. In general, we say that polynomials
$f$ and $g$ are linearly related if there are linear $L$ and $M$ such that $L \circ f \circ M = g$.

Remark 3.0.4. It is clear from the definitions that if some decomposition may be
obtained from $\vec{f}$ by a Ritt swap at $i$, then one of the following must happen:

• both $f_i$ and $f_{i+1}$ are linearly related to monomials;

• both $f_i$ and $f_{i+1}$ are linearly related to odd-degree Chebyshev polynomials;

• $f_i$ is linearly related to a monomial $P_p$ and $f_{i+1}$ is linearly related to a ritty
polynomial whose out-degree is a multiple of $p$;

• $f_{i+1}$ is linearly related to a monomial $P_p$ and $f_i$ is linearly related to a ritty
polynomial whose in-degree is a multiple of $p$.

Remark 3.0.5. In the definition of one decomposition having been obtained from
another via a Ritt swap at $i$ we suppress the auxiliary choices of the linear
polynomials $L$, $M$, and $N$. At the level of decompositions, there is a real ambiguity
attributable to these choices. For example, $(x \cdot (x + 1)^5, x^5)$ may be swapped to
$(x^5, x \cdot (x^5 + 1))$ taking $L(x) = M(x) = N(x) = x$. It may also be swapped to
$\left((32)^6 x^5, \frac{x}{2} \cdot \frac{x^5 + 1}{32}\right)$ taking $L(x) = (32)^6 x$, $M(x) = 32x$, and $N(x) = 2x$. Although
the factors in this decomposition are not monic, the basic Ritt identity used does
involve monic ritty polynomials.
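
The two swaps in this example can be verified with exact rational arithmetic. The Python sketch below (not from the paper) checks that both pairs compose to $x^5 (x^5 + 1)^5$, and that they are related by the single scaling $x \mapsto 64x$ inserted between the factors; that witness was computed for this illustration and is not stated in the text. Helper names are hypothetical.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as ascending coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_pow(p, n):
    """n-th power of a polynomial (ordinary product)."""
    out = [1]
    for _ in range(n):
        out = poly_mul(out, p)
    return out

def compose(p, q):
    """Return p(q(x)) using Horner's rule, with trailing zeros removed."""
    out = [0]
    for a in reversed(p):
        out = poly_mul(out, q)
        out[0] += a
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out

x5 = [0, 0, 0, 0, 0, 1]                             # x^5
f2 = poly_mul([0, 1], poly_pow([1, 1], 5))          # x (x + 1)^5
f1 = x5
target = compose(f2, f1)                            # x^5 (x^5 + 1)^5

# First swap: (x^5, x (x^5 + 1)).
a2, a1 = x5, [0, 1, 0, 0, 0, 0, 1]
# Second swap: (32^6 x^5, (x/2) * (x^5 + 1)/32).
b2 = [0, 0, 0, 0, 0, 32 ** 6]
b1 = [Fraction(c, 64) for c in a1]

assert compose(a2, a1) == target
assert compose(b2, b1) == target

# Assumed witness of linear equivalence: the scaling x -> 64 x.
scale, unscale = [0, 64], [0, Fraction(1, 64)]
assert b2 == compose(a2, scale)                     # b2 = a2 o (64 x)
assert b1 == compose(unscale, a1)                   # b1 = (x / 64) o a1
```

`Fraction` is used so that the coefficient $1/64$ is represented exactly and the equality tests are not affected by rounding.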

However, the two resulting decompositions are linearly equivalent, defined four lines 
below. Our first fundamental result, Proposition 13. f .51 is that at most one decom- 
position can be obtained from a given one by a Ritt swap at a given i, up to linear 
equivalence. 

Definition 3.0.10. The decompositions $(f_k, \ldots, f_1)$ and $(g_k, \ldots, g_1)$ are linearly
equivalent if there are linear polynomials $L_{k-1}, \ldots, L_1$ for which $g_k = f_k \circ L_{k-1}$,
$g_i = L_i^{-1} \circ f_i \circ L_{i-1}$ for $k > i > 1$, and $g_1 = L_1^{-1} \circ f_1$.

If $(f_k, \ldots, f_1)$ and $(g_k, \ldots, g_1)$ are linearly equivalent, then they are
decompositions of the same polynomial. Linear equivalence, as the name suggests, is an
equivalence relation.

Definition 3.0.11. We often write $\vec{f}$ for a decomposition $(f_k, \ldots, f_1)$; $[\vec{f}]$ is the
linear-equivalence class of this decomposition. For a polynomial $f$, $D_f$ is the set of
linear-equivalence classes of decompositions of $f$.

Ritt's theorem says basically that Ritt swaps act transitively on $D_f$. Our first
fundamental result is that this slogan can be formalized. Here is our statement of
Ritt's theorem:

Theorem 3.1 (Ritt). Over $\mathbb{C}$, any two decompositions of the same polynomial have
the same length. Indeed, if $\vec{f}$ and $\vec{g}$ are decompositions of the same polynomial, then
$\vec{g}$ is linearly equivalent to a decomposition obtained from $\vec{f}$ by a finite sequence of
Ritt swaps.

A diligent reference hunter will note that our definition of ritty polynomials 
differs from Ritt's in that we require u to be monic in the last case; this does not 
make a difference: 

Lemma 3.1.1. The notion of "Ritt swap" is not changed if in the last case of the
definition of ritty polynomial we drop the requirement that $u$ be monic.

Proof. It suffices to show that if $p$ is a prime, $k$, $n$, and $\ell$ are integers, and $U$ is
a polynomial, and $x^k \cdot U(x^{p\ell})^n$ satisfies the requirements in the definition of ritty
polynomial, then $(x^k \cdot U(x^\ell)^{pn}, P_p)$ is obtained by a Ritt swap from $(P_p, x^k \cdot U(x^{p\ell})^n)$.
Let $a$ be the leading coefficient of $U$. Set $\nu(x) := x$, $\mu(x) := a^n x$, and $\lambda(x) := a^{pn} x$.
Then $\lambda^{-1} \circ P_p \circ \mu = P_p$ and $\mu^{-1} \circ (x^k \cdot U(x^{p\ell})^n) \circ \nu = x^k \cdot u(x^{p\ell})^n$ where $u(y) := U(y)/a$
is monic. The result is now clear. □

The next lemma shows that the relation "$\vec{g}$ is obtained from $\vec{f}$ by a Ritt swap
at $i$" is invariant under linear equivalence.

Lemma 3.1.2. If $\vec{f}$, $\vec{g}$, and $\vec{h}$ are decompositions of the same polynomial, $\vec{g}$ is
obtained from $\vec{f}$ by a Ritt swap at $i$, and $\vec{h}$ is linearly equivalent to $\vec{f}$, then there is
a decomposition obtained from $\vec{h}$ by a Ritt swap at $i$ and linearly equivalent to $\vec{g}$.

Proof. Let $R_{k-1}, \ldots, R_1$, $L$, $M$, and $N$ be linear polynomials witnessing our
hypotheses. That is, the $R$s witness that $\vec{h}$ is linearly equivalent to $\vec{f}$:
$h_k = f_k \circ R_{k-1}$, $h_j = R_j^{-1} \circ f_j \circ R_{j-1}$ for $1 < j < k$, $h_1 = R_1^{-1} \circ f_1$, and the other
linear polynomials witness the Ritt swap: $(L^{-1} \circ f_{i+1} \circ M) \circ (M^{-1} \circ f_i \circ N) = T \circ S$
is a basic Ritt identity, $g_i := S \circ N^{-1}$, $g_{i+1} := L \circ T$, and $g_j := f_j$ for the other
$j \le k$. To simplify the notation, we define $R_k(x) = R_0(x) = x$.
Define $L' := R_{i+1}^{-1} \circ L$, $M' := R_i^{-1} \circ M$, and $N' := R_{i-1}^{-1} \circ N$. It is now routine
to check that this choice of $L'$, $M'$, and $N'$ witnesses that $\vec{h}$ admits a Ritt swap at
$i$ and that the resulting decomposition is linearly equivalent to $\vec{g}$. □

To finish formalizing our slogan, that is, defining an action of Ritt swaps on the
linear-equivalence classes in $D_f$, we need "$\vec{g}$ is obtained from $\vec{f}$ by a Ritt swap at $i$"
to be a function rather than a relation. Our next lemma has a surface appearance
of yet another elementary result about compositions of polynomials, but its proof,
while locally elementary, is surprisingly complicated. In fact, it is the crucial result
required to show that sequences of Ritt swaps form a monoid that acts on
linear-equivalence classes of decompositions. This is one of the two crucial results towards
canonical sequences of Ritt swaps, which give the near-action of the permutation
group.

Lemma 3.1.3. If two decompositions $\vec{h}$ and $\vec{g}$ are both obtained from $\vec{f}$ by a Ritt
swap at $i$, then $\vec{h}$ is linearly equivalent to $\vec{g}$.

Our proof of this crucial Lemma 3.1.3 occupies most of Section 3.2. Let us give
a hint of the difficulties involved. Suppose there are two polynomials $u$ and $w$,
integers $k$, $k'$ and $p$, and a field element $A$ such that $f_1 := (x^k \cdot u(x^p)) \circ (+A) = x^{k'} \cdot w(x^p)$.
(We will show that such polynomials do not actually exist.) Then
the decomposition $(P_p, f_1)$ admits two Ritt swaps: one with $L = M = N = \mathrm{id}$
produces the decomposition $(x^{k'} \cdot w(x)^p, P_p)$; the other, with $L = M = \mathrm{id}$ and
$N = (-A)$, produces $(x^k \cdot u(x)^p, P_p \circ (+A))$. It is evident that these two are not
linearly equivalent. As this example suggests, the bulk of our work will consist of
classifying linearly related ritty polynomials.

We start the proof of the lemma here, indicating precisely what we need to 
resolve. The next section is devoted to resolving it. 

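Proof. Suppose that we can perform a Ritt swap at $i$ in two different ways. That is,
for $j = 1$ or $2$ we can find linear polynomials $L_j$, $M_j$, and $N_j$ and ritty polynomials
$G_j$, $H_j$, $\hat{G}_j$ and $\hat{H}_j$ such that

• $G_j = L_j^{-1} \circ f_{i+1} \circ M_j$

• $H_j = M_j^{-1} \circ f_i \circ N_j$

• $G_j \circ H_j = \hat{H}_j \circ \hat{G}_j$ is a basic Ritt identity

• $g_{i+1} = L_1 \circ \hat{H}_1$

• $g_i = \hat{G}_1 \circ N_1^{-1}$

• $h_{i+1} = L_2 \circ \hat{H}_2$, and

• $h_i = \hat{G}_2 \circ N_2^{-1}$.

We are charged with finding a linear $R$ for which $(L_1 \circ \hat{H}_1) \circ R = (L_2 \circ \hat{H}_2)$ and
$R^{-1} \circ (\hat{G}_1 \circ N_1^{-1}) = \hat{G}_2 \circ N_2^{-1}$.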

Now then, if we set $L := L_2^{-1} \circ L_1$, $M := M_2^{-1} \circ M_1$, and $N := N_2^{-1} \circ N_1$, then
we have $L \circ G_1 \circ M^{-1} = G_2$ and $M \circ H_1 \circ N^{-1} = H_2$. We claim that it is enough
to find a linear $R$ for which $(L \circ \hat{H}_1) \circ R = \hat{H}_2$ and $R^{-1} \circ (\hat{G}_1 \circ N^{-1}) = \hat{G}_2$. Indeed,
apply $L_2$ to the left of the first equation and $N_2^{-1}$ to the right of the second.

It remains to analyze the possible ways to express ritty polynomials as linear
polynomials composed with ritty polynomials. We carry out this analysis in
Section 3.2 and complete this proof at the end of that section. Note that in light of
Lemma 3.0.1 we may assume that at least one of the linear factors is non-trivial. □

3.2. Linear relations amongst ritty polynomials. This section is devoted to 
characterizing linearly related ritty polynomials. We first reduce the study of lin- 
early related polynomials to the cases where the linear functions are both transla- 
tions or both scalings. 
First, some notation: 

Definition 3.1.1. We say that the polynomials $f$ and $g$ are scaling related if they
are linearly related witnessed by scalings. We say that $f$ and $g$ are translation
related if they are linearly related witnessed by translations.

A special case of scaling related polynomials will be seen so often that it deserves 
its own notation. 

Definition 3.1.2. Given any polynomial $f$ and nonzero scalar $c$, we define

$$c * f := \frac{f(cx)}{c^{\deg(f)}}$$

Note that if $f$ is monic, then $c * f$ is also monic. And now, the reduction:

Lemma 3.1.4. If $f$ and $g$ are linearly related ritty polynomials, then there is
another ritty polynomial $h$ which is scaling related to $f$ and translation related to $g$.



Proof. Since the group of automorphisms of the affine line is the semidirect product
of the group of translations by the group of scalings, if $f$ and $g$ are linearly related,
then we can find a third polynomial $h$ which is scaling related to $f$ and translation
related to $g$.

That is, if $L(x) = Ax + B$ and $M(x) = Cx + D$ are linear polynomials with
$L \circ f \circ M = g$, then as $L = (+B) \circ (\cdot A)$ and $M = (\cdot C) \circ (+\tfrac{D}{C})$ we may take
$h := (\cdot A) \circ f \circ (\cdot C)$ and then $g = (+B) \circ h \circ (+\tfrac{D}{C})$. We need to show that $h$ is itself
ritty.

Since $h$ is translation-related to a monic polynomial $g$, $h$ is itself monic. It is
clear that if $f$ and $h$ are scaling related monic polynomials, then $h = \lambda * f$ for some
$\lambda$.

It remains to show that a monic polynomial scaling related to a ritty polynomial
is itself ritty:

• $\lambda * P_p = P_p$ for every $p$, including $p = 2$.

• As noted above, for $p$ odd, $C_p$, being an odd function, fits into the next case.

• Given a polynomial $u$ and natural numbers $k$, $\ell$, and $n$, if $w := (\lambda^\ell) * u$,
then $\lambda * (x^k \cdot u(x^\ell)^n) = x^k \cdot w(x^\ell)^n$.

□ 
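
The scaling computations in the last step of this proof can be spot-checked numerically. The Python sketch below (not from the paper) assumes the reading $c * f := f(cx) / c^{\deg f}$ of Definition 3.1.2 and the illustrative values $\lambda = 3$, $k = 1$, $\ell = 2$, $n = 3$, $u(x) = x + 5$; the names `star`, `S` and `P` are hypothetical, and `star` expects a coefficient list whose last entry is nonzero.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as ascending coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_pow(p, n):
    """n-th power of a polynomial (ordinary product)."""
    out = [1]
    for _ in range(n):
        out = poly_mul(out, p)
    return out

def compose(p, q):
    """Return p(q(x)) using Horner's rule, with trailing zeros removed."""
    out = [0]
    for a in reversed(p):
        out = poly_mul(out, q)
        out[0] += a
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out

def P(p):
    """The monomial x^p."""
    return [0] * p + [1]

def S(k, l, n, u):
    """The polynomial x^k * u(x^l)^n."""
    return poly_mul(P(k), poly_pow(compose(u, P(l)), n))

def star(c, f):
    """c * f := f(c x) / c^deg(f), computed exactly with rationals."""
    c = Fraction(c)
    d = len(f) - 1
    return [Fraction(a) * c ** i / c ** d for i, a in enumerate(f)]

lam, k, l, n = 3, 1, 2, 3
u = [5, 1]                                    # u(x) = x + 5

assert star(lam, P(5)) == P(5)                # lambda * P_p = P_p
w = star(lam ** l, u)                         # w = (lambda^l) * u
lhs = star(lam, S(k, l, n, u))                # lambda * (x^k u(x^l)^n)
rhs = S(k, l, n, w)                           # x^k w(x^l)^n
assert lhs == rhs
assert lhs[-1] == 1                           # the result is still monic
```

With these values both sides equal $x \cdot (x^2 + 5/9)^3$.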



Before the last part of the above proof is forgotten, we exploit it to show that
carefully inserting scalings into a basic Ritt identity produces another basic Ritt
identity. We will prove something of a converse to this lemma in Lemma 3.2.1.






Corollary 3.1.1. If $b \circ a = d \circ c$ is a basic Ritt identity, and $a$ and $b$ are not both
Chebyshev, then for any non-zero $\lambda$, $\mu$, there are non-zero $\nu$ and $\eta$ such that

$$(\mu * b) \circ (\lambda * a) = (\eta * d) \circ (\nu * c)$$

is also a basic Ritt identity.

Proof. Since $a$ and $b$ are not both Chebyshev, one of them must be a monomial. If
both are monomials, the result is immediate as $\lambda * P_p = P_p$ for any $p$ and non-zero
$\lambda$.

If only $a = P_p$ is a monomial, then $b$ must be of the form $x^k \cdot u(x^\ell)^{pn}$. Then
looking at the proof of the lemma, $\mu * b = \mu * (x^k \cdot u(x^\ell)^{pn}) = x^k \cdot w(x^\ell)^{pn}$ for
$w := (\mu^\ell) * u$. Since $\lambda * a = \lambda * P_p = P_p$, we see that

$$(\mu * b) \circ (\lambda * a) = (x^k \cdot w(x^\ell)^{pn}) \circ P_p = P_p \circ (x^k \cdot w(x^{p\ell})^n)$$

is a basic Ritt identity, and we may take $\eta = 1$ and $\nu$ with $\nu^p = \mu$.

If only $b = P_p$ is the monomial, then $(\mu * b) = b = c = (\nu * c) = P_p$ for any $\nu$, so
we might as well take $\nu = 1$. Now $a$ must be of the form $x^k \cdot u(x^{p\ell})^n$, so

$$\lambda * a = \lambda * (x^k \cdot u(x^{p\ell})^n) = x^k \cdot w(x^{p\ell})^n$$

where $w = \lambda^{p\ell} * u$. It is easy to check as above that $\eta = \lambda^p$ works. □
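The basic Ritt identity used in this proof, $(x^k \cdot w(x^\ell)^{pn}) \circ P_p = P_p \circ (x^k \cdot w(x^{p\ell})^n)$, can be checked mechanically on examples. The sketch below is only an illustration: polynomials are coefficient lists (lowest degree first), and all helper names are assumptions made for this example.

```python
def pmul(f, g):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def ppow(f, n):
    """n-th power of f, for n >= 0."""
    out = [1]
    for _ in range(n):
        out = pmul(out, f)
    return out

def compose(f, g):
    """Composition f o g, computed by Horner's rule."""
    out = [f[-1]]
    for a in reversed(f[:-1]):
        out = pmul(out, g)
        out[0] += a
    return out

def monomial(p):
    """The monomial P_p = x^p."""
    return [0] * p + [1]

def ritty_form(k, w, ell, n):
    """The polynomial x^k * w(x^ell)^n."""
    return [0] * k + ppow(compose(w, monomial(ell)), n)

def ritt_identity_sides(k, w, ell, n, p):
    """Both sides of (x^k w(x^ell)^(pn)) o P_p = P_p o (x^k w(x^(p ell))^n)."""
    b = ritty_form(k, w, ell, p * n)
    c = ritty_form(k, w, p * ell, n)
    return compose(b, monomial(p)), compose(monomial(p), c)

lhs, rhs = ritt_identity_sides(k=1, w=[-5, 1], ell=2, n=1, p=3)
```

Both sides expand to $x^{pk} \cdot w(x^{p\ell})^{pn}$, so the two lists agree for any choice of the parameters.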

Remark 3.1.1. In particular, if $a$ and $b$ are not both linearly related to Chebyshev
polynomials, and $(d, c)$ is obtained from $(b, a)$ by a Ritt swap, then some $(\tilde{d}, \tilde{c})$
linearly equivalent to $(d, c)$ can be obtained from $(b, a)$ by a Ritt swap witnessed
by translations.

We now return to characterizing linearly related ritty polynomials. With the
reduction in Lemma 3.1.4, it is sufficient to separately characterize scaling-related
ritty polynomials, and translation-related ritty polynomials. We have just done the
first in the proof of Lemma 3.1.4, and we can immediately give the whole
answer for monomials:

Proposition 3.1.1. For all $p$ including $p = 2$, $P_p$ is not linearly related to any
ritty polynomial other than itself, nor is it translation-related to itself. For every $c$
and every $p$, $c * P_p = P_p$.

Proof. Only the first of the three assertions is not immediate. It is true because $P_p$
is the only ritty polynomial of degree $p$ with a unique zero. □

As we have noted that Chebyshev polynomials are a special case of the last kind 
of ritty polynomials, we are left with the following problem. 

Problem 3.1.1. When are two ritty polynomials of the form $x^k \cdot u(x^\ell)^n$ translation
related? That is, we need to solve

$$(+B) \circ (x^k \cdot u(x^\ell)^n) \circ (+A) = (x^{k_2} \cdot u_2(x^{\ell_2})^{n_2})$$

The brunt of the work in this section is devoted to solving this. We make a 
few easy observations and summarize our results before diving into the necessary 
computations.

Remark 3.1.2. In light of Lemma 3.0.1, we are not interested in the case $A = B = 0$;
evaluating both sides at $0$ shows that the case $A = 0 \ne B$ is impossible; thus we
assume that $A \ne 0$ and examine separately the cases where $B = 0$ and where
$B \ne 0$.

The next two propositions are proved later in this section. 

Proposition 3.1.2. All solutions of Problem 3.1.1 with $B = 0$ are of the form

$$(x^\ell \cdot (x - A)^m \cdot u(x)^n) \circ (+A) = (x + A)^\ell \cdot x^m \cdot u(x + A)^n$$

Proof. This follows immediately from Proposition 3.2.1 below and Lemma 3.1.5. □

Definition 3.1.3. A ritty polynomial of the form $f(x) = x^\ell \cdot (x - A)^m \cdot u(x)^n$
with both $\gcd(\ell, n) > 1$ and $\gcd(m, n) > 1$ is called a type J ritty polynomial. A
polynomial linearly related to a type J ritty polynomial is called a type J swappable
polynomial.

Remark 3.1.3. Since a ritty polynomial is, by definition, indecomposable, if $f(x) =
x^\ell \cdot (x - A)^m \cdot u(x)^n$ is a type J ritty polynomial, then $\gcd(\ell, m, n) = 1$. Unless
$\gcd(m, n) > 1$, $f$ is not a ritty polynomial; the condition that $\gcd(\ell, n) > 1$ implies
that $f \circ (+A)$ is also a ritty polynomial. Since indecomposability and swappability
are invariant under linear relatedness, a type J swappable polynomial is necessarily
swappable and indecomposable.

Remark 3.1.4. The representation of a type J ritty polynomial need not be unique: 
the polynomial 

1 {•:•<• ,1,;.H - " 

where pj are distinct primes, is translation-related to n different type J ritty poly- 
nomials. 

Remark 3.1.5. The existence of type J ritty polynomials rules out the possibility
that in-degree and out-degree might be well-defined up to linear relatedness, and
thus might be a property of a swappable indecomposable. However, we can at least
say that the in-degree of a type J swappable indecomposable $f$ is 1, in the sense that
any ritty polynomial linearly related to $f$ has in-degree 1. In light of Remark 3.0.4,
we can see that if a factor $f_i$ of a decomposition $\vec{f}$ is type J, then no decomposition
may be obtained from $\vec{f}$ by a Ritt swap at $i$.

Proposition 3.1.3. For each degree $d$ and for each $A \ne 0$, there is a unique triple
of polynomials $U_A$, $V_A$, $W_A$ of degree $d$ such that for some $B, C \ne 0$

$$(+B) \circ (x \cdot U_A(x^2)) \circ (+A) = x \cdot V_A(x)^2$$

$$(+C) \circ (x \cdot V_A(x)^2) \circ (-2A) = x \cdot W_A(x)^2$$

Proof. This follows immediately from Propositions 3.2.4 and 3.2.3 and Lemma 3.1.5. □

Before explaining how these solutions come from Chebyshev polynomials, we 
make an easy observation 

Lemma 3.1.5. If $f$ and $g$ are ritty polynomials giving a solution to Problem 3.1.1,
that is if $(+B) \circ f \circ (+A) = g$, then $A * f$ is a solution of the same problem with 1
in place of $A$: $(+C) \circ (A * f) \circ (+1) = A * g$.

So it is sufficient to characterize solutions for one particular $A$, as we do in the
next proposition.
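The normalization of Lemma 3.1.5 can be illustrated numerically. In the sketch below, which is an assumption-laden example rather than anything from the paper, polynomials are coefficient lists (lowest degree first), the helper names are hypothetical, and the constant is taken to be $C = B / A^{\deg(f)}$, which is what the computation $(A * g)(x) = g(Ax)/A^{\deg(f)}$ forces.

```python
from fractions import Fraction

def pmul(f, g):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def compose(f, g):
    """Composition f o g, computed by Horner's rule."""
    out = [f[-1]]
    for a in reversed(f[:-1]):
        out = pmul(out, g)
        out[0] += a
    return out

def translate(f, a):
    """f o (+a), that is, x -> f(x + a)."""
    return compose(f, [a, 1])

def scale(c, f):
    """c * f := f(c x) / c**deg(f)."""
    c = Fraction(c)
    d = len(f) - 1
    return [Fraction(a) * c ** (i - d) for i, a in enumerate(f)]

def normalised_sides(f, A, B):
    """Return ((+C) o (A*f) o (+1), A*g) for g = (+B) o f o (+A)."""
    d = len(f) - 1
    g = translate(f, A)
    g[0] += B
    lhs = translate(scale(A, f), 1)
    lhs[0] += Fraction(B) / Fraction(A) ** d  # C := B / A^deg(f)
    return lhs, scale(A, g)

# Example: f = x^3 - 3x, A = -2, B = 2, so g = x (x - 3)^2.
lhs, rhs = normalised_sides([0, -3, 0, 1], A=-2, B=2)
```

In this example both sides equal $x(x + 3/2)^2$, a solution of the same problem with translation by 1.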

Proposition 3.1.4. In Proposition 3.1.3, $x \cdot U_{-2}(x^2) = C_{2d+1}$ is the $(2d + 1)$st
Chebyshev polynomial.

For any $A$, $U_{-A} = U_A$ and $W_A = V_{-A} = (-1) * V_A$.
Further, $(x \cdot V_{-2}(x)^2) \circ Q = Q \circ C_{2d+1}$.

Proof. For odd $n$, both $(+2) \circ C_n \circ (-2)$ and $(-2) \circ C_n \circ (+2)$ are of the form
$x \cdot v(x)^2$. Both come from the fact that $C_n$ commutes with $C_2(x) = x^2 - 2$. Here
is the reason for the first:

$$C_n \circ C_2 = C_2 \circ C_n$$
$$C_n \circ (-2) \circ Q = (-2) \circ Q \circ C_n$$
$$(+2) \circ C_n \circ (-2) \circ Q = Q \circ C_n$$

For the second, first observe that

$$i * C_2 = -((ix)^2 - 2) = -(-x^2 - 2) = x^2 + 2$$

Now

$$C_n \circ C_2 = C_n \circ (\cdot(-1)) \circ (\cdot(-1)) \circ C_2 \circ (\cdot i) \circ (\cdot(-i)) = (\cdot(-1)) \circ C_n \circ (x^2 + 2) \circ (\cdot(-i)) = C_2 \circ C_n$$

So, bringing all outside linear factors to the right and introducing $(-2)$ on the left,

$$(-2) \circ C_n \circ (+2) \circ Q = (-2) \circ (\cdot(-1)) \circ C_2 \circ C_n \circ (\cdot i)$$

Now, $[(-2) \circ (\cdot(-1)) \circ C_2](x) = -(x^2 - 2) - 2 = -x^2 = [Q \circ (\cdot(\pm i))](x)$, so

$$(-2) \circ C_n \circ (+2) \circ Q = Q \circ (\cdot(\pm i)) \circ C_n \circ (\cdot i) = Q \circ (i * C_n)$$

□
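The two identities in this proof can be verified for small odd $n$ with the following Python sketch. It assumes the normalization $C_0 = 2$, $C_1 = x$, $C_{m+1} = x C_m - C_{m-1}$ (so that $C_2 = x^2 - 2$, as in the text); the helper names and the coefficient-list representation (lowest degree first) are illustrative choices.

```python
def pmul(f, g):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def psub(f, g):
    """Difference f - g, padding the shorter list with zeros."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return [a - b for a, b in zip(f, g)]

def compose(f, g):
    """Composition f o g, computed by Horner's rule."""
    out = [f[-1]]
    for a in reversed(f[:-1]):
        out = pmul(out, g)
        out[0] += a
    return out

def chebyshev(n):
    """C_n with C_0 = 2, C_1 = x, C_{m+1} = x C_m - C_{m-1}, for n >= 1."""
    prev, cur = [2], [0, 1]
    for _ in range(n - 1):
        prev, cur = cur, psub(pmul([0, 1], cur), prev)
    return cur

def i_scale_odd(f):
    """i * f for an odd polynomial f of odd degree n.

    The coefficient a_j (j odd) is multiplied by i^(j-n) = (-1)^((n-j)/2).
    """
    n = len(f) - 1
    return [a * (-1) ** ((n - j) // 2) if a else 0 for j, a in enumerate(f)]

Q = [0, 0, 1]       # x^2
C2 = [-2, 0, 1]     # x^2 - 2

def check(n):
    Cn = chebyshev(n)
    commutes = compose(Cn, C2) == compose(C2, Cn)
    g = compose(Cn, [-2, 1]); g[0] += 2      # (+2) o C_n o (-2)
    first = compose(g, Q) == pmul(Cn, Cn)    # equals Q o C_n
    h = compose(Cn, [2, 1]); h[0] -= 2       # (-2) o C_n o (+2)
    iCn = i_scale_odd(Cn)
    second = compose(h, Q) == pmul(iCn, iCn)  # equals Q o (i * C_n)
    return commutes, first, second

results = [check(n) for n in (3, 5, 7)]
```

For $n = 3$ the translated polynomials are $x(x - 3)^2$ and $x(x + 3)^2$, matching the claimed shape $x \cdot v(x)^2$.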

We now name these polynomials, and the solutions for other $A$ which
are scaling-related to these.

Definition 3.1.4. Continuing to use the notation from Proposition 3.1.3, $\widehat{C}_p(x) :=
x \cdot V_{-2}(x)^2$. For $\lambda \ne 0$,

$$C_{p,\lambda} := \lambda * C_p \quad \text{and} \quad \widehat{C}_{p,\lambda} := \lambda * \widehat{C}_p$$

These ritty polynomials are called type C ritty polynomials.

A (necessarily indecomposable and swappable) polynomial linearly related to a 
Chebyshev polynomial is called a type C swappable polynomial. 

Remark 3.1.6. A Ritt swap involving two monomials or two Chebyshev polynomials 
really swaps two factors of a decomposition. While the non-monomial factors on 
the two sides of the last kind of a Ritt swap are different, it is useful to think 
of one of them becoming the other while simultaneously swapping places in the 
decomposition with the monomial. 

To study indecomposables that might in this sense become type C or type J after 
a sequence of Ritt swaps, we introduce a formal notion and collect some taxonomy. 

Definition 3.1.5. Given two indecomposable polynomials $a$ and $b$, the relation "$a$
may become $b$ after a sequence of Ritt swaps" is the transitive closure of the relation
"there exist indecomposable $c$ and $d$ such that $c \circ a = b \circ d$ or $a \circ c = d \circ b$".

Definition 3.1.6. (A list of different kinds of ritty polynomials)

• Monomials are self-explanatory.

• Type C ritty polynomials $C_{p,\lambda}$ and $\widehat{C}_{p,\lambda}$ are defined above.

• Type J ritty polynomials are defined above. 

• A ritty polynomial that is not type J but can become type J after a sequence 
of Ritt swaps is called a type coJ ritty polynomial. 

• A ritty polynomial that is not a monomial, type C, J, or coJ is called a type 
B ritty polynomial. 

• A (necessarily indecomposable and swappable) polynomial linearly related 
to a type X ritty polynomial is called a type X swappable polynomial, for 
X=C, J, coJ, B. 

Remark 3.1.7. The symbols C, J and B are not entirely arbitrary, but we chose to 
spare the reader from our eccentric naming conventions: C for Chebyshevichi, J for 
Janus (Jani in the plural), and B for boring. 

Definition 3.1.7.

• A class $\mathcal{C}$ of swappable polynomials is closed under Ritt
swaps if whenever $f_i \in \mathcal{C}$ and $\vec{g}$ is obtained from $\vec{f}$ by a Ritt swap at $i$
(resp., $i - 1$), then $g_{i+1} \in \mathcal{C}$ (resp., $g_{i-1} \in \mathcal{C}$). In other words, if $a$ is in the
class, and $a$ may become $b$ after a sequence of Ritt swaps in the sense of
Remark 3.1.6, then $b$ is also in the class.

• A class of ritty polynomials is closed under Ritt swaps if the same holds
under the additional assumption that $g_{i+1}$ (resp., $g_{i-1}$) is ritty.

Theorem 3.2. The only ritty polynomials among the type C (respectively, J, coJ) 
swappable polynomials are the type C ( respectively, type J, coJ) ritty polynomials. 
These classes of ritty polynomials - monomials, types C, J, coJ, and B - are disjoint 
and closed under linear relatedness, and cover all ritty polynomials. 
The class of type C swappable polynomials is closed under Ritt swaps; so is the class 
of type J and coJ swappable polynomials. 

All translation relations among ritty polynomials are listed in Propositions 3.1.2 and
3.1.3.

Proof. For the most part this theorem merely collects and translates into new
notation the results of Propositions 3.1.3, 3.1.4 and 3.1.2 above. One bit worth
explaining is that the classes of type C, type J, and type coJ swappables are disjoint.
Type coJ ritty polynomials are not translation related to any other ritty
polynomials, because their in-degree is too high for them to be type J, and their $k$ is too
high for them to be type C. In particular, type J or coJ ritty polynomials are not
type C. The last part of the theorem is proved in Proposition 3.2.2. □

We now prove some useful consequences of this theorem, including Lemma 3.1.3.
We begin with something of a converse to Corollary 3.1.1, as we use it in the proof of
Lemma 3.1.3, which in turn is used to prove Lemma 3.2.2, which looks very similar
to Lemma 3.2.1.

Lemma 3.2.1. If $a$ and $b$ are ritty and not both type C, and $L$, $M$, and $N$ are
linear, and $(L \circ b \circ M^{-1}) \circ (M \circ a \circ N^{-1}) = d \circ c$ is a basic Ritt identity, then $L$,
$M$, and $N$ are scalings.

Furthermore, there are ritty $\tilde{c}$ and $\tilde{d}$ such that $b \circ a = \tilde{d} \circ \tilde{c}$ is another basic
Ritt identity, which is linearly equivalent to the first one, and in particular $(\tilde{d}, \tilde{c})$ is
linearly equivalent to $(d, c)$.

Proof. Since $a$ and $b$ are not both type C, one of them must be (linearly related to,
and therefore equal to) a monomial.

If $a$ is a monomial, then $M$ and $N$ must be scalings, since monomials are not
translation-related to any ritty polynomial. Then, since both $b$ and $L \circ b \circ M^{-1}$ are
ritty and $M$ is a scaling, $L$ must also be a scaling because the equation in Problem
3.1.1 has no solutions with $B \ne 0 = A$.

If $b$ is a monomial, then $L$ and $M$ must be scalings. Since both $a$ and $(M \circ a \circ N^{-1})$
must be ritty, either $N$ is a scaling or both $a$ and $(M \circ a \circ N^{-1})$ must be type J.
But $(L \circ b \circ M^{-1}, M \circ a \circ N^{-1})$ is swappable, contradicting Remark 3.1.5.

"Furthermore" follows immediately from Corollary 3.1.1. □

Now we can finish the proof of Lemma 3.1.3. Afterwards, we must go back and
prove the two Propositions 3.1.2 and 3.1.3.

Proof. Let us recall the situation. We have ritty polynomials $G_1$, $G_2$, $H_1$, $H_2$, $\tilde{G}_1$,
$\tilde{G}_2$, $\tilde{H}_1$, and $\tilde{H}_2$ and linear polynomials $L$, $M$, and $N$ such that

$$L \circ G_1 \circ M^{-1} = G_2 \quad \text{and} \quad M \circ H_1 \circ N^{-1} = H_2$$

and $G_j \circ H_j = \tilde{H}_j \circ \tilde{G}_j$ are basic Ritt identities for $j = 1$ and $2$. We need to find a
linear polynomial $R$ such that

$$L \circ \tilde{H}_1 \circ R = \tilde{H}_2 \quad \text{and} \quad R^{-1} \circ \tilde{G}_1 \circ N^{-1} = \tilde{G}_2$$

In fact, we show that $R = M^{-1}$ always works, and that it is always a scaling.
We consider separately the three cases that none, one, or both of $G_1$ and $H_1$ are
monomials. Since $G_2$ is linearly related to $G_1$, $G_2$ is a monomial if and only if $G_1$
is, and if both are monomials, then $G_1 = G_2$, and similarly for $H_i$.

(none) In this case, $\tilde{G}_i = G_i$ and $\tilde{H}_i = H_i$ are Chebyshev polynomials of odd
degree, since commuting Chebyshevs are the only basic Ritt identity not involving
any monomials. Then $R = M^{-1}$ works. (In fact, $L = M = N = (\cdot(\pm 1))$ in this
case, as Chebyshevs are not non-trivially linearly related to themselves except via
$(-1) * C_p = C_p$.)

(two) In this case, $\tilde{G}_i = G_i$ and $\tilde{H}_i = H_i$ are monomials, since this is the only
basic Ritt identity with two monomials on one side. Then $R = M^{-1}$ works. (In
fact, $L$, $M$, and $N$ are scalings in this case, as monomials are not non-trivially
translation related to themselves.)

(one) This is done in Lemma 3.2.1, with $b := G_1$, $a := H_1$, with one less
assumption. □

This proof does not use the hardest part of our analysis: it suffices to know that
Problem 3.1.1 has no solutions with $A = 0 \ne B$, and to have a characterization of
solutions with $B = 0 \ne A$, the type J ritty polynomials. Rest assured that we do
need the full answer for the second crucial Lemma 3.2.6.

We end this section with a lemma closely resembling Lemma 3.2.1 because its
proof is similar to the last two, though it will not be used until much later.

Lemma 3.2.2. Suppose that $a$ and $b$ are ritty and neither is type C; and $D$, $C$, $B$
and $A$ are linear; and $\tilde{b} := D \circ b \circ C$ and $\tilde{a} := B \circ a \circ A$ are ritty, and $\tilde{b} \circ \tilde{a} = d \circ c$
is a basic Ritt identity.

Then $D$, $B$, and $A$ are scalings; and there are a translation $T$, a scaling $C'$, and
a ritty $b' := b \circ T$ such that $b \circ C = b' \circ C'$; and for some $\tilde{d}$ and $\tilde{c}$, $b' \circ a = \tilde{d} \circ \tilde{c}$ is a
basic Ritt identity. Unless $b$ is type J, $T = \mathrm{id}$ and $b' = b$.

Proof. Since neither $a$ nor $b$ is type C, $D$ and $B$ must be scalings. If $A$ is not a
scaling, then $a$ and $\tilde{a}$ must be type J, but this contradicts Remark 3.1.5. So $A$ is a
scaling by $\lambda$ and $\tilde{a} = \lambda * a$.

Write $C = T \circ C'$ for a scaling $C' =: (\cdot\mu)$ and a translation $T$. Then $b' := b \circ T$
is a monic polynomial scaling-related to the ritty $\tilde{b}$, so it is itself ritty. So if $T \ne \mathrm{id}$,
then $b$ is type J, and in any case $\tilde{b} = \mu * b'$.

So $(\mu * b') \circ (\lambda * a) = d \circ c$ is a basic Ritt identity. By Corollary 3.1.1 there are
$\eta$ and $\nu$ such that $b' \circ a = (\eta * d) \circ (\nu * c)$ is a basic Ritt identity. □

3.2.1. Zeros of $f(x) = x^k \cdot u(x^\ell)^n$ and its derivative. We now turn to proving
Propositions 3.1.2 and 3.1.3. It may be interesting to note that the only consequence
of the indecomposability of $f$ we use here is that $\gcd(k, \ell) = \gcd(k, n) = 1$, that is
that no monomial is a compositional factor of $f$.

Some of our preliminary computations appear in [3] . 

In this subsection we fix a ritty polynomial $f(x)$ of the form $f(x) = x^k \cdot u(x^\ell)^n$
satisfying the requirements of Definition 3.0.5 and study its zeros as well as those of
its derivative. We introduce several auxiliary polynomials and integer parameters
and the association from $f$ to these auxiliary objects is intended to be notationally
uniform.

The following notation remains in force only throughout this Section 3.2.1.

Notation/Assumption 3.2.1. Given a nonzero polynomial $g(y)$ we define:

$$\bar{g}(y) := \prod_{\{a : g(a) = 0\}} (y - a)$$

$$\hat{g}(y) := g(y)/\bar{g}(y)$$

$$\tilde{g}(y) := g'(y)/\hat{g}(y)$$

Let us return now to the case that $f(x) = x^k \cdot u(x^\ell)^n$ with the usual restrictions
on the data defining $f$. As we will be studying the relations between $f$ and other
polynomials of a similar form, we fix $f_2(x) = x^{k_2} \cdot u_2(x^{\ell_2})^{n_2}$. For each of the terms
defined for $f$, we have a corresponding term defined for $f_2$ noted with a subscript
of "2."

Throughout this section we write $s := \deg(u)$ and $t := \deg(\bar{u})$. We define two
more associated polynomials

$$v(y) := u(y)^{n-1} \cdot \hat{u}(y)$$

$$w(y) := k\bar{u}(y) + n\ell y\tilde{u}(y)$$

Note that every zero of $v$ is also a zero of $u$ while $w$ shares no zeros with $u$ (and,
hence, with $v$). Note also that $\deg(v) = (n - 1)s + (s - t) = ns - t$ and $\deg(w) = t$.
A simple calculation shows that

$$f'(x) = x^{k-1} u(x^\ell)^{n-1} \hat{u}(x^\ell)[k\bar{u}(x^\ell) + n\ell x^\ell \tilde{u}(x^\ell)] = x^{k-1} \cdot v(x^\ell) \cdot w(x^\ell)$$

Thus, the zeros of $f'$ may be described as follows.
(1) If $k > 1$, then $f'(0) = 0$ and $\mathrm{ord}_0(f') = k - 1$.

(2) Each $\ell$th root of a zero of $v$ is a zero of both $f$ and $f'$. Hence, counting
multiplicity, there are $\ell(ns - t)$ zeros of $f'(x)$ arising as $\ell$th roots of zeros
of $v$.

(3) There are $\ell t$ zeros of $f'$ arising as the $\ell$th roots of zeros of $w$. None of these
are zeros of $f$.

From the above calculation, one sees that in some sense most of the ramification
of $f$ occurs above zero. (When we say that a zero $a$ of $f'$ lies above $b$, we simply
mean that $f(a) = b$.)

Note that the zeros of $f'$ of the last two kinds come in batches of size $\ell$ consisting
of a (multiplicative) translate of the group of roots of unity of order dividing $\ell$.
Moreover, if $a$ and $b$ are zeros of the third kind and are in the same batch, then
they lie above different points. Indeed, write $b = \zeta a \ne a$ for some $\ell$th root of unity
$\zeta$. We know that $w(a^\ell) = 0$ implies that $u(a^\ell) \ne 0$. As $\gcd(k, \ell) = 1$, we have
$f(b) = (\zeta a)^k u((\zeta a)^\ell)^n = \zeta^k a^k u(a^\ell)^n \ne a^k u(a^\ell)^n = f(a)$.

Let us note some numerical consequences of these observations. 

Observation 3.2.1. With the notation given above: 

• # ramification points of $f$ lying above zero $= (k - 1) + \ell(ns - t)$

• # ramification points of $f$ lying above other points $= \ell t$ and this set of
ramification points is a union of $t$ cosets of the group of $\ell$th roots of unity.
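The counts in Observation 3.2.1 can be tabulated with a few lines of Python. This is only a bookkeeping sketch; the function name and argument order are assumptions made for illustration. The two counts must add up to $\deg(f') = k + \ell n s - 1$.

```python
def ramification_counts(k, ell, n, s, t):
    """Ramification of f(x) = x^k * u(x^ell)^n, counted with multiplicity.

    s = deg(u), t = number of distinct roots of u.
    Returns (count above zero, count above other points).
    """
    above_zero = (k - 1) + ell * (n * s - t)
    elsewhere = ell * t
    # Consistency check: together these account for every zero of f'.
    assert above_zero + elsewhere == k + ell * n * s - 1
    return above_zero, elsewhere

# f = x (x^2 - 3): k = 1, ell = 2, n = 1, s = t = 1.
cubic_chebyshev = ramification_counts(1, 2, 1, 1, 1)
# f = x (x - 3)^2: k = 1, ell = 1, n = 2, s = t = 1.
translated = ramification_counts(1, 1, 2, 1, 1)
```

For $x(x^2 - 3)$ both critical points $\pm 1$ lie above nonzero values, while for $x(x - 3)^2$ the critical point $3$ lies above zero and the critical point $1$ does not.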

Let us return to the equation

(1) $(+B) \circ (x^k \cdot u(x^\ell)^n) \circ (+A) = x^{k_2} \cdot u_2(x^{\ell_2})^{n_2}$

As the linear operations are invertible, we see that if $f_2(a) = f_2(b)$, then $f(a +
A) = f(b + A)$.

Differentiating Equation (1) we have

$$f' \circ (+A) = f_2'$$

Hence, for any point $a$, we have $\mathrm{ord}_a f_2' = \mathrm{ord}_{a+A} f'$. That is, $(+A)$ translates
the zeros of the derivative of $f_2$ onto the zeros of the derivative of $f$ respecting
multiplicities. If $B = 0$, then the ramification of $f_2$ above zero is matched precisely
with the ramification of $f$ above zero. If $B \ne 0$, then the ramification of $f$ above
one other point is matched with the ramification of $f_2$ above zero and vice versa.
It is this consequence which makes these seemingly trivial observations useful.

Lemma 3.2.3. If $\ell > 1$, then the sum of the roots of $f$ is zero as is the sum of the
roots of $f'$.

Proof. $f$ has a zero at $0$ which contributes nothing to the sum while the rest of its
zeros are the $\ell$th roots of the zeros of $u$. For each zero $c$ of $u$ pick one $\ell$th root $b$.
Then the others have the form $\zeta b$ for some $\ell$th root of unity $\zeta$. As the sum of the $\ell$th
roots of unity is zero, the sum over all these roots is zero. From our expression for
$f'(x)$, we see that it, too, may be expressed as a power of $x$ times a function of $x^\ell$.
Hence, by the same reasoning the sum of its roots is zero. □
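Since the sum of the roots (with multiplicity) of a polynomial is minus the ratio of its second-highest coefficient to its leading coefficient, Lemma 3.2.3 is easy to test on examples. The sketch below is illustrative only: polynomials are coefficient lists, lowest degree first, and the helper names are assumptions. It also shows how the root sums move when the polynomial is precomposed with a translation.

```python
from fractions import Fraction

def pmul(f, g):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def compose(f, g):
    """Composition f o g, computed by Horner's rule."""
    out = [f[-1]]
    for a in reversed(f[:-1]):
        out = pmul(out, g)
        out[0] += a
    return out

def pderiv(f):
    """Derivative of f."""
    return [i * a for i, a in enumerate(f)][1:]

def root_sum(f):
    """Sum of the roots of f, with multiplicity (Vieta)."""
    return Fraction(-f[-2], f[-1])

# f = x * (x^3 - 3)^2, that is k = 1, ell = 3, n = 2, u = y - 3.
inner = compose([-3, 1], [0, 0, 0, 1])
f = [0] + pmul(inner, inner)

sums_f = (root_sum(f), root_sum(pderiv(f)))
# Translating by A = 2 moves every root r to r - 2.
g = compose(f, [2, 1])
sums_g = (root_sum(g), root_sum(pderiv(g)))
```

Here `f` has degree 7, so after translating by 2 the root sums are $-14$ for the polynomial and $-12$ for its derivative, both nonzero.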

Corollary 3.2.1. If $A \ne 0$ and $\ell > 1$, then the roots of $g(x) := (x^k \cdot u(x^\ell)^n) \circ (+A)$
do not sum to zero, and neither do the roots of $h'(x)$ where $h(x) = (+B) \circ (x^k \cdot
u(x^\ell)^n) \circ (+A)$.

Proof. Combining the above observations, we see that the sum of the roots of $g$ is
$-A \cdot \deg(g)$ while the sum of the roots of $h'(x)$ is $-A \cdot (\deg(h) - 1)$ (which is not zero
as $\deg(h) \ge 3$). □

The following Corollary already appears in [3]. 

Corollary 3.2.2. In Equation (1), one of $\ell$ and $\ell_2$ must be 1.

Proof. Use the previous two results, remembering that we are only interested in the
case $A \ne 0$. (Evaluating Equation (1) at $0$ shows that $A = 0$ implies that $B = 0$.) □

Notation/Assumption 3.2.2. In light of Lemma 3.1.5 above, we may and do
assume that $A = 1$ from now on.

In light of the last result, we may and do assume that $\ell = 1$ from now on, breaking
the symmetry of the two sides of the equation.

In the next section, we address the possibility that B = 0; after that, we return 
to the other, significantly more difficult possibility. 

3.2.2. isolating type J. In this section, we find all solutions to the equation

$$(x^k \cdot u(x)^n) \circ (+1) = (x^{k_2} \cdot u_2(x^{\ell_2})^{n_2})$$

that is, the special case of Problem 3.1.1 where $B = 0$ or, equivalently, where
$u(1) = 0$. In light of Lemma 3.1.5, we have set $A = 1$ in Problem 3.1.1. We just
proved that we may assume without loss of generality that $\ell = 1$, which is why it
is missing.

Lemma 3.2.4. Suppose that $f(x) := (x^k \cdot u(x)^n)$ and $f_2(x) := (x^{k_2} \cdot u_2(x^{\ell_2})^{n_2})$
and $f(x + 1) = f_2(x)$. Then $\ell_2 = 1$.

Proof. Since $f_2(x - 1) = f(x)$, we see that $k = \mathrm{ord}_0(f) = \mathrm{ord}_{-1}(f_2)$. As $(-1)^{k_2} \ne 0$,
it follows that $-1$ is a $k$-fold zero of $u_2(x^{\ell_2})^{n_2}$. Then any other $\ell_2$-th root $\xi$ of $(-1)^{\ell_2}$
is also a $k$-fold zero of $f_2(x)$. Thus unless $\ell_2 = 1$, some $(\xi + 1) \ne 0$ is an exactly
$k$-fold zero of $f$, and therefore of $u(x)^n$. Hence, $n$ divides $k$. As $\gcd(n, k) = 1$, it
follows that $n = 1$. But $\ell = n = 1$ so that $f$ is not ritty. With this contradiction
we see that $\ell_2 = 1$. □

Thus we only need to solve the equation $(x^k \cdot u(x)^n) \circ (+1) = (x^{k_2} \cdot u_2(x)^{n_2})$,
with $u(1) = 0$.

Lemma 3.2.5. If $u(1) = 0$ and the equation

$$(x^k \cdot u(x)^n) \circ (+1) = (x^{k_2} \cdot u_2(x)^{n_2})$$

holds, then there are integers $m$ and $m_2$ and a monic polynomial $U$ such that
$k = m_2 n_2$, $k_2 = mn$, $u(x) = (x - 1)^m \cdot U(x)^{n_2}$, and $u_2(x) = (x + 1)^{m_2} \cdot U(x + 1)^n$.

Proof. We have $k_2 = \mathrm{ord}_0 f_2 = \mathrm{ord}_1 f$ implying that $n$ divides $k_2$. Setting $m :=
k_2/n$ we see that $\mathrm{ord}_1 u = m$. Likewise, $k = \mathrm{ord}_0 f = \mathrm{ord}_{-1} f_2$ so that $n_2$ divides
$k$. Write $m_2 := k/n_2$ and note that $\mathrm{ord}_{-1} u_2 = m_2$.

Write $u(x) = (x - 1)^m P(x)$ and $u_2(x) = (x + 1)^{m_2} P_2(x)$. Then, our equation is

$$(x + 1)^{m_2 n_2} \cdot [x^m P(x + 1)]^n = x^{mn} \cdot [(x + 1)^{m_2} P_2(x)]^{n_2}$$

Cancelling $(x + 1)^{m_2 n_2} \cdot x^{mn}$, we obtain

$$P(x + 1)^n = P_2(x)^{n_2}$$

Recalling that $k = n_2 m_2$ and $n$ are relatively prime, so that $\gcd(n, n_2) = 1$, it
must be that $P$ is an $n_2$-th power and $P_2$ an $n$-th power. Write $P(x) = U(x)^{n_2}$ and
$P_2(x) = U_2(x)^n$. As $U(x + 1)^{n n_2} = U_2(x)^{n n_2}$, if we take these polynomials to be
monic, then we have $U(x + 1) = U_2(x)$. □

Together, these lemmata prove Proposition 3.1.2 above and vindicate
Definition 3.1.3.

Proposition 3.2.1. All solutions to the equation

$$(+B) \circ (x^k \cdot u(x^\ell)^n) \circ (+1) = (x^{k_2} \cdot u_2(x^{\ell_2})^{n_2})$$

with $u(1) = 0$ are of the form

$$(x^{m_2 n_2} \cdot (x - 1)^{mn} U(x)^{n_2 n}) \circ (+1) = x^{mn} \cdot (x + 1)^{m_2 n_2} \cdot U(x + 1)^{n_2 n}$$

for some monic polynomial $U$ and integers $m$ and $m_2$ satisfying $\gcd(mn, n_2) =
\gcd(m_2 n_2, n) = 1$.

3.2.3. $B \ne 0$, a first reduction. We are left with the possibility that $u(1) \ne 0$ in
the equation in Problem 3.1.1. Recall that we have reduced to the case that $A = 1$
(Lemma 3.1.5) and $\ell = 1$ (Corollary 3.2.2). Thus we need to solve the equation

$$(+B) \circ (x^k \cdot u(x)^n) \circ (+1) = (x^{k_2} \cdot u_2(x^{\ell_2})^{n_2})$$

when $B \ne 0$. This section is devoted to showing that $n = n_2 \ell_2 = 2$.

Then, the following two sections examine the two possibilities, $n_2 = 2 = 2\ell_2$ and
$\ell_2 = 2 = 2n_2$.

Proposition 3.2.2. If $u(1) \ne 0$ and the following equation holds

$$(+B) \circ (x^k \cdot u(x^\ell)^n) \circ (+1) = (x^{k_2} \cdot u_2(x^{\ell_2})^{n_2})$$

then $k = k_2 = 1$, $s = t$, $s_2 = t_2$ (that is, all roots of $u$ and of $u_2$ are simple), and
$n = n_2 \ell_2 = 2$.

Proof. Note that $u(1) \ne 0$ if and only if $B \ne 0$.

Recall that for any $\xi$ we have $\mathrm{ord}_\xi f_2' = \mathrm{ord}_{\xi+1} f'$ and that if $\xi \ne 0$ and $f_2'(\xi) = 0$,
then for any $\zeta$ with $\zeta^{\ell_2} = 1$ we have $\mathrm{ord}_\xi f_2' = \mathrm{ord}_{\zeta\xi} f_2'$. Recall our observation that as long as
$f_2(\xi) \ne 0$ and $\zeta \ne 1$ is an $\ell_2$-th root of unity as above, $f_2(\xi) \ne f_2(\zeta\xi)$. In particular,
for each of the batches of ramification points of $f_2$ described in Observation 3.2.1,
there is at most one which lies above $B$. Translation by 1 takes the ramification of
$f_2$ to ramification of $f$ and it takes the points lying above $B$ to points lying above
$0$. Thus,

• at most one point in each batch of the ramification of $f_2$ above a point
other than zero goes to the ramification of $f$ over $0$ and

• the rest of the ramification points of $f_2$ must go to the ramification of $f$
above other points.

Translating into inequalities involving the various degrees of $f$ and $f_2$ using the
calculations in Observation 3.2.1 and recalling that $\ell = 1$ we have the following

• $(k - 1) + (ns - t) \le t_2$

• $(k_2 - 1) + \ell_2(n_2 s_2 - t_2) + (\ell_2 - 1)t_2 \le t$

Adding and collecting $t$s on the right, we have

$$(k_2 - 1) + (k - 1) + \ell_2 n_2 s_2 + ns \le 2t + 2t_2$$

Recall that $t \le s$, $t_2 \le s_2$, $n = n\ell \ge 2$, $\ell_2 n_2 \ge 2$, $k \ge 1$, and $k_2 \ge 1$. Thus, all of
these inequalities must be equalities. □

Since the numbers $n_2$ and $\ell_2$ are positive integers, the equality $\ell_2 n_2 = 2$ implies
$\ell_2 = 1$ and $n_2 = 2$ or $\ell_2 = 2$ and $n_2 = 1$.

3.2.4. uniqueness of solutions with $B \ne 0$. The next two propositions prove
Proposition 3.1.3.

Proposition 3.2.3. There is exactly one $u$ of each degree satisfying

$$(+B) \circ (x \cdot u(x)^2) \circ (+1) = (x \cdot u_2(x)^2)$$

Proof. Consider the equation

(2) $(+B) \circ (x \cdot u(x)^2) \circ (+1) = (x \cdot u_2(x)^2)$

with $B \ne 0$ and $u$ and $u_2$ having only simple roots. Since $B \ne 0$, it follows that
$u(x + 1)$ and $u_2(x)$ are coprime. Differentiating Equation (2) and using the notation
from the beginning of this section we have

(3) $v(x + 1)w(x + 1) = v_2(x)w_2(x)$

or

(4) $u(x + 1) \cdot (u(x + 1) + 2(x + 1)u'(x + 1)) = u_2(x) \cdot (u_2(x) + 2xu_2'(x))$

From the coprimality of $u(x + 1)$ and $u_2(x)$ and noting the leading coefficients
we obtain two equations.

(5) $(2s + 1)u_2(x) = u(x + 1) + 2(x + 1)u'(x + 1)$

(6) $(2s + 1)u(x + 1) = u_2(x) + 2xu_2'(x)$

Differentiating Equation (5) again we get

(7) $(2s + 1)u_2'(x) = 3u'(x + 1) + 2(x + 1)u''(x + 1)$

Multiplying Equation (6) by $(2s + 1)$, and then using Equations (5) and (7) to eliminate
$u_2$ and $u_2'$, we obtain

(8) $(2s + 1)^2 u(x + 1) = u(x + 1) + 2(x + 1)u'(x + 1) + 2x(3u'(x + 1) + 2(x + 1)u''(x + 1))$

Collecting terms and writing $z := x + 1$ and $Y(z) := u(z)$, we see that $u$ must satisfy
the following ODE

(9) $(2s^2 + 2s)Y + (3 - 4z)Y' + 2(z - z^2)Y'' = 0$

A routine calculation shows that if $u$ is a solution to Equation (9) and we
define $u_2(x)$ via Equation (5) and set $B := -u(1)^2$, then these data satisfy Equation (2).

The linear differential operator $L = 2(z - z^2)\partial^2 + (3 - 4z)\partial + (2s^2 + 2s)$ defines
a linear operator on the $(s + 1)$-dimensional space of polynomials of degree at most $s$. With
respect to the standard monomial basis of this space, the matrix $M = (M_{i,j})$ of $L$ is
upper triangular. On the main diagonal, we have $M_{i,i} = 2(1 - i)i - 4i + (2s^2 + 2s) =
(2s^2 + 2s) - (2i^2 + 2i)$ and just above the diagonal we have $M_{i,i+1} = (i + 1)(3 + 2i)$.
In particular, $M_{s,s} = 0$ so that $\mathrm{rank}(L) \le s$ whilst the $(s, s)$-minor is invertible.
Thus, the rank of $L$ is $s$ and the dimension of the space of solutions to Equation (9)
is exactly one. As we require $u$ to be monic, there is exactly one solution of degree
$s$. □
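The linear algebra in this proof is concrete enough to run. The following Python sketch is an illustration under stated assumptions, not the authors' code: polynomials are coefficient lists (lowest degree first) and all function names are hypothetical. It solves the bidiagonal system coming from the ODE $(2s^2 + 2s)Y + (3 - 4z)Y' + 2(z - z^2)Y'' = 0$ by back-substitution, builds $u_2$ from the relation $(2s + 1)u_2(x) = u(x + 1) + 2(x + 1)u'(x + 1)$, and tests whether $(+B) \circ (x \cdot u(x)^2) \circ (+1) = x \cdot u_2(x)^2$ with $B = -u(1)^2$.

```python
from fractions import Fraction

def pmul(f, g):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def padd(f, g):
    """Sum f + g, padding the shorter list with zeros."""
    n = max(len(f), len(g))
    return [(f[i] if i < len(f) else 0) + (g[i] if i < len(g) else 0)
            for i in range(n)]

def compose(f, g):
    """Composition f o g, computed by Horner's rule."""
    out = [f[-1]]
    for a in reversed(f[:-1]):
        out = pmul(out, g)
        out[0] += a
    return out

def pderiv(f):
    """Derivative of f."""
    return [i * a for i, a in enumerate(f)][1:]

def monic_kernel(s):
    """Monic degree-s polynomial u annihilated by the operator L."""
    c = 2 * s * s + 2 * s
    u = [Fraction(0)] * (s + 1)
    u[s] = Fraction(1)
    # Row i of the system reads M[i][i] u_i + M[i][i+1] u_{i+1} = 0, with
    # M[i][i] = c - (2 i^2 + 2 i) and M[i][i+1] = (i + 1)(2 i + 3).
    for i in range(s - 1, -1, -1):
        diag = c - (2 * i * i + 2 * i)
        upper = (i + 1) * (2 * i + 3)
        u[i] = -Fraction(upper) * u[i + 1] / diag
    return u

def check_translation_identity(s):
    """Return (identity holds?, B) for the monic kernel element of degree s."""
    u = monic_kernel(s)
    u_shift = compose(u, [1, 1])            # u(x + 1)
    du_shift = compose(pderiv(u), [1, 1])   # u'(x + 1)
    rhs5 = padd(u_shift, pmul([2, 2], du_shift))
    u2 = [a / (2 * s + 1) for a in rhs5]    # u_2 via Equation (5)
    lhs = compose(pmul([0, 1], pmul(u, u)), [1, 1])  # (x u^2) o (+1)
    B = -sum(u) ** 2                        # B := -u(1)^2
    lhs[0] += B
    rhs = pmul([0, 1], pmul(u2, u2))        # x u_2(x)^2
    return lhs == rhs, B

degree_one = check_translation_identity(1)
degree_two = check_translation_identity(2)
```

For $s = 1$ this yields $u(z) = z - 3/4$ and $u_2(x) = x + 3/4$, and indeed $(x + 1)(x + 1/4)^2 - 1/16 = x(x + 3/4)^2$.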

Proposition 3.2.4. For each positive integer $s$, there is a unique monic polynomial
$u$ of degree $s$ and nonzero parameter $B$ for which there is another monic polynomial
$u_2$ satisfying

(10) $(+B) \circ (x \cdot u(x)^2) \circ (+1) = (x \cdot u_2(x^2))$

Proof. As before, since $B \ne 0$, $u(x + 1)$ and $u_2(x^2)$ are coprime. Differentiating,
we obtain

(11) $u(x + 1) \cdot (u(x + 1) + 2(x + 1)u'(x + 1)) = u_2(x^2) + 2x^2 u_2'(x^2) = (u_2 + 2x \cdot u_2') \circ Q$

The zeros of the righthand side of Equation (11) come in $\pm$-pairs. We claim that
for each such pair one is a root of $u(x + 1)$ and the other is a root of $(u(x + 1) +
2(x + 1)u'(x + 1))$. Indeed, it cannot happen that $u(c + 1) = 0$ and $u(-c + 1) = 0$
for Equation (10) would yield $cu_2(c^2) = B = -cu_2((-c)^2) = -cu_2(c^2)$ contrary to
the fact that $B \ne 0$. Thus, at most one of each pair of roots of the righthand side
is also a root of $u(x + 1)$. As the degree of the righthand side of Equation (11) is
twice that of $u$, it follows that at least one root from each pair must be a root of
$u(x + 1)$. Matching leading coefficients, we conclude:

(12) $(-1)^s (2s + 1)u(x + 1) = (u(-x + 1) + 2(-x + 1)u'(-x + 1))$

Substituting $z := -x + 1$, we see that $u$ satisfies the following difference-differential
equation:

(13) $0 = u(z) + 2zu'(z) - (2s + 1)(-1)^s u(2 - z)$

The difference-differential operator in Equation (13) is a linear operator on the
space of polynomials of degree at most $s$ and it is given by an upper triangular matrix relative
to the standard monomial basis. The entries along the main diagonal are

$$1 + 2i - (-1)^{i+s}(2s + 1)$$

Hence, the rank of this operator is exactly $s$ meaning that there is a unique monic
solution. □
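As with the previous proposition, the operator $u \mapsto u(z) + 2zu'(z) - (2s + 1)(-1)^s u(2 - z)$ can be written down as a matrix and its monic kernel element computed. The Python sketch below is an illustration only (coefficient lists with lowest degree first; all names are assumptions). It then checks that $(x \cdot u(x)^2) \circ (+1)$ is a constant plus an odd polynomial, which is what the equation $(+B) \circ (x \cdot u(x)^2) \circ (+1) = x \cdot u_2(x^2)$ requires, and reads off $B$ and $u_2$.

```python
from fractions import Fraction
from math import comb

def pmul(f, g):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def compose(f, g):
    """Composition f o g, computed by Horner's rule."""
    out = [f[-1]]
    for a in reversed(f[:-1]):
        out = pmul(out, g)
        out[0] += a
    return out

def operator_matrix(s):
    """Matrix of u -> u(z) + 2 z u'(z) - (2s+1)(-1)^s u(2 - z) on 1, z, ..., z^s.

    M[i][j] is the coefficient of z^i in the image of z^j.
    """
    sign = (-1) ** s
    M = [[Fraction(0)] * (s + 1) for _ in range(s + 1)]
    for j in range(s + 1):
        M[j][j] += 1 + 2 * j
        # (2 - z)^j = sum_i C(j, i) 2^(j-i) (-1)^i z^i
        for i in range(j + 1):
            M[i][j] -= (2 * s + 1) * sign * comb(j, i) * 2 ** (j - i) * (-1) ** i
    return M

def monic_solution(s):
    """Monic degree-s kernel element, found by back-substitution."""
    M = operator_matrix(s)
    u = [Fraction(0)] * (s + 1)
    u[s] = Fraction(1)
    for i in range(s - 1, -1, -1):
        acc = sum(M[i][j] * u[j] for j in range(i + 1, s + 1))
        u[i] = -acc / M[i][i]
    return u

def check_odd_shape(s):
    """Return (even part vanishes?, B, coefficients of u_2)."""
    u = monic_solution(s)
    g = compose(pmul([0, 1], pmul(u, u)), [1, 1])   # (x u^2) o (+1)
    even_part_vanishes = all(g[i] == 0 for i in range(2, len(g), 2))
    return even_part_vanishes, -g[0], g[1::2]

degree_one = check_odd_shape(1)
degree_two = check_odd_shape(2)
```

For $s = 1$ the solution is $u(z) = z - 3/2$, and $(x + 1)(x - 1/2)^2 = x^3 - \tfrac{3}{4}x + \tfrac{1}{4}$, so $B = -1/4$ and $u_2(y) = y - 3/4$.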

We are now done with linearly related ritty polynomials, and the proof of the
crucial Lemma 3.1.3.

3.3. The Ritt monoid. With Lemma 3.1.3 in place we can describe the sequences
of Ritt swaps as a monoid acting on the set of linear-equivalence classes of
decompositions. Unfortunately, we will need to return to some messy computations to
verify that this action works as required.

Definition 3.2.1. For a non-linear polynomial $f$ we let $D_f$ be the set of distinct
decompositions of $f$ up to linear equivalence. That is, $D_f := \{[f_k, \ldots, f_1] : f =
f_k \circ \cdots \circ f_1, \text{ each } f_i \text{ indecomposable}\}$ where $[f_k, \ldots, f_1]$ is the equivalence class of
the decomposition with respect to linear equivalence.

For the remainder of this section we will work with a fixed polynomial $f$
admitting a decomposition of length $k$. By Ritt's theorem, all decompositions of $f$ have
length $k$.

Definition 3.2.2. Let $M_k$ be the free monoid on the $(k - 1)$ generators $t_1, \ldots, t_{k-1}$.

Our work from the previous section allows us to define an action of $M_k$ on $D_f$
whereby $t_i$ acts by converting a (linear equivalence class of a) decomposition into
the (linear equivalence class of a) decomposition obtained by a Ritt swap at $i$, if
possible. A word in the $t_i$s (that is, an element of $M_k$) corresponds to a sequence
of Ritt swaps. This action is very close to the action of the symmetric group $S_k$
with $t_i$ identified with the transposition $(i \;\, i + 1)$.

Definition 3.2.3. We define an action of $M_k$ on $\overline{D}_f := D_f \cup \{\infty\}$ as follows. If
some decomposition $\vec{h}$ may be obtained from $\vec{f}$ by a Ritt swap at $i$, we say that
$t_i \star [\vec{f}] = [\vec{h}]$; if no decomposition may be obtained from $\vec{f}$ by a Ritt swap at $i$, we
say that $t_i \star [\vec{f}] = \infty$; and $t_i \star \infty = \infty$ for all $i$.

For $w \in M_k$ and $[\vec{f}] \in D_f$ we say that $w \star [\vec{f}]$ is defined if $w \star [\vec{f}] \ne \infty$.

We often abuse notation writing $w \star \vec{f} = \vec{g}$ for $w \star [\vec{f}] = [\vec{g}]$.

Ritt's theorem may be restated as saying that this action is transitive on $D_f$.

Our goal now is to show that this action is more or less the same as an action
of the permutation group $S_k$ on a set with $k$ elements. Of course, this cannot be
literally true, but the result is close enough for our purposes.

Fact 3.2.1. The following is a presentation of $S_k$, the symmetric group on $k$ letters.
Generators:

$(i \;\, i + 1)$ for $1 \le i < k$

Relations:

$(i \;\, i + 1)^2 = \mathrm{id}$
$(i \;\, i + 1)(j \;\, j + 1) = (j \;\, j + 1)(i \;\, i + 1)$ for $j \ne i \pm 1$
$(i \;\, i + 1)(i + 1 \;\, i + 2)(i \;\, i + 1) = (i + 1 \;\, i + 2)(i \;\, i + 1)(i + 1 \;\, i + 2)$

To make the connection more precise, we define 

Definition 3.2.4. The permutation represented by a word $t_{a_r} \cdots t_{a_2} t_{a_1}$ in $M_k$ is
$(a_r + 1 \;\, a_r) \cdots (a_2 + 1 \;\, a_2)(a_1 + 1 \;\, a_1)$.
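To make Definition 3.2.4 and the relations of Fact 3.2.1 concrete, here is a small Python sketch. It is illustrative only: a permutation of $\{1, \ldots, k\}$ is stored as a tuple of images, products are composed right to left (an assumed convention), and the function names are hypothetical.

```python
def transposition(i, k):
    """The transposition (i i+1) in S_k, as a tuple p with p[j-1] the image of j."""
    p = list(range(1, k + 1))
    p[i - 1], p[i] = p[i], p[i - 1]
    return tuple(p)

def compose_perm(p, q):
    """The permutation 'p after q'."""
    return tuple(p[q[j] - 1] for j in range(len(q)))

def permutation_of_word(word, k):
    """Permutation represented by the word t_{a_r} ... t_{a_1}.

    `word` lists the indices [a_r, ..., a_1] in the order they are written;
    the rightmost generator is applied first.
    """
    result = tuple(range(1, k + 1))
    for a in word:
        result = compose_perm(result, transposition(a, k))
    return result

# The three families of relations, checked in S_4.
involution = permutation_of_word([2, 2], 4)
commuting = (permutation_of_word([1, 3], 4), permutation_of_word([3, 1], 4))
braid = (permutation_of_word([1, 2, 1], 4), permutation_of_word([2, 1, 2], 4))
```

Two words such as $t_1 t_2 t_1$ and $t_2 t_1 t_2$ represent the same permutation, which is exactly the situation the braid-type relation for Ritt swaps addresses.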

Our fundamental result is that our action satisfies the relations satisfied by $S_k$.

Lemma 3.2.6. For any $\vec{f} \in D_f$ and $i < k$

• If $t_i \star \vec{f}$ is defined, then $t_i^2 \star \vec{f} = \vec{f}$.

• For $j \ne i \pm 1$, $t_i t_j \star \vec{f} = t_j t_i \star \vec{f}$. In particular, one is defined if and only
if the other is.

• $t_i t_{i+1} t_i \star \vec{f} = t_{i+1} t_i t_{i+1} \star \vec{f}$. In particular, one is defined if and only if the
other is.

The first two parts are immediate from the definition. The last part will be
proved in the next section. It is clear that the qualifier 'if defined' is necessary
in the first clause of the lemma. In most cases this rules out a true action
of the permutation group on $D_f$, i.e. the possibility that any two words in $M_k$
representing the same permutation act the same way. However, we will show in
Corollary 3.2.3 that if two words $w$ and $w'$ represent the same permutation and
both $w \star \vec{f}$ and $w' \star \vec{f}$ are defined, then they are equal.

3.4. Descalings, double-jumps, and the crucial combinatorial lemma. The
main aim of this section is

Proposition 3.2.5 (Crucial combinatorial lemma). For any decomposition $\vec f$ and
for any $i$, $t_i t_{i+1} t_i \star \vec f = t_{i+1} t_i t_{i+1} \star \vec f$, i.e. one is defined if and only if the other
is, and when both are defined, the result is the same.

The second part follows immediately from the results of Müller and Zieve [13],
but we do not see how to obtain the first part without our intermediate results.
We also use descalings and double-jumps, defined and described in this section, in
Section 3.6.1.

3.4.1. Descalings. What we do here is overkill for the Crucial Combinatorial Lemma
at hand, but we will need it in the proof of Theorem 3.3 and it makes sense to
develop the ideas here.

Definition 3.2.5. Given a decomposition $\vec f$, we define a descaling of $\vec f$ to be a
$k$-tuple $(M_i, h_i, L_i)_{i \le k}$ of triples where $h_i$ are ritty, and linear $L_i$ and $M_i$ are
translations except possibly for $M_k$ and $L_1$, such that
$((M_k \circ h_k \circ L_k), \ldots, (M_1 \circ h_1 \circ L_1))$ is linearly equivalent to $\vec f$.

Lemma 3.2.7. If all factors $f_i$ in a decomposition $\vec f$ are swappable, then $\vec f$ admits a
descaling $(M_i, h_i, L_i)$, in which we may choose, for each $1 \le i < k$, to have $M_i = \mathrm{id}$
or $L_{i+1} = \mathrm{id}$, and also choose to have one of $M_k$ and $L_1$ be a translation.

Proof. We sketch the proof for making $L_1$ a translation and $M_i = \mathrm{id}$ for all $i < k$.
Other options are identical. First, find ritty $g_i$ and linear $B_i$ and $A_i$ such that
$f_i = B_i \circ g_i \circ A_i$. To make $L_1$ a translation, write $A_1 = (\cdot\lambda) \circ L_1$, let $h_1 := \lambda * g_1$.
Now $\vec f$ is linearly equivalent to $(f_k, \ldots, f_2 \circ (\cdot\lambda^{\deg(g_1)}), h_1 \circ L_1)$. Inducting on $k$
finishes the proof. □

Definition 3.2.6. • A left descaling of a decomposition is a descaling $(M_i, h_i, L_i)$
where $M_i = \mathrm{id}$ for all $i \neq k$ and $L_1$ is a translation.

• A left descaling $(M_i, h_i, L_i)$ has no loose translations at $j$ if neither $h_j$ nor
$h_{j+1}$ is type C, and if $h_{j+1}$ is type J, then $L_{j+1} = M_j = \mathrm{id}$.

• A left descaling $(M_i, h_i, L_i)$ has no essential translations at $j$ if neither $h_j$
nor $h_{j+1}$ is type C, and $L_{j+1} \circ M_j = \mathrm{id}$.

• A left descaling $(M_i, h_i, L_i)$ has no loose (resp. essential) translations if it
has no loose (resp. essential) translations at $j$ for any $1 \le j < k$.

• If $(M_i, h_i, L_i)$ is a descaling of $\vec f$, then a compatible descaling of $t_j \star \vec f$ is
$(M_i, h'_i, L_i)$ where $h'_i = h_i$ for all $i \neq j, j+1$, and $L_{j+1} = M_j = \mathrm{id}$, and
$h'_{j+1} \circ h'_j = h_{j+1} \circ h_j$.

With two compatible descalings, the linear factors witnessing the Ritt swap may
be taken to be the identity.

Lemma 3.2.8. If a left descaling $(M_i, h_i, L_i)$ of $\vec f$ has no loose translations at $j$
and $t_j \star \vec f$ is defined, then $(M_i, h_i, L_i)$ has no essential translations at $j$ and $t_j \star \vec f$
admits a descaling compatible with this Ritt swap.

Proof. Let us unwrap the definitions. Since $j < k$ and this is a left descaling, we
are assuming that $M_j = \mathrm{id}$. We are assuming that neither $h_{j+1}$ nor $h_j$ is of type C,
and that if $h_{j+1}$ is type J, then $L_{j+1} = M_j = \mathrm{id}$. We need to show that $L_{j+1} = \mathrm{id}$,
and that there are ritty $g_{j+1}$ and $g_j$ such that $g_{j+1} \circ g_j = h_{j+1} \circ h_j$ is a Ritt swap.

We first show that $L_{j+1} = \mathrm{id}$. We already know this when $h_{j+1}$ is type J, so we
assume it is not. So we have ritty $h_{j+1}$ and $h_j$ and a translation $L_{j+1}$ such that
$(h_{j+1} \circ L_{j+1}, h_j)$ is swappable, but $h_{j+1}$ is neither type C nor type J, and $h_j$ is not
type C. This is clearly impossible unless $L_{j+1} = \mathrm{id}$.

Now neither $h_{j+1}$ nor $h_j$ is type C, and $(h_{j+1}, h_j)$ is swappable, and Lemma 3.2.1
finds the desired ritty $g_{j+1}$ and $g_j$ for us. □

Proposition 3.2.6. Suppose that $(M_i, h_i, L_i)_{i \le k}$ is a left descaling of $\vec f$ which has
no loose translations, let $v := (t_{k-1} t_{k-2} \cdots t_1)$ and suppose that $v \star \vec f$ is defined.
Then $(M_i, h_i, L_i)$ has no essential translations, and $v \star \vec f$ admits a left descaling
with no essential translations, compatible with the Ritt swaps in $v$.

Proof. Since $t_1 \star \vec f$ is defined and $(M_i, h_i, L_i)$ has no loose translations at 1, the last
lemma says that it has no essential translations at 1 and $t_1 \star \vec f$ admits a compatible
descaling $(M_i, h'_i, L_i)$. That is, $(M_i, h_i, L_i) = (M_k \circ h_k \circ L_k, h_{k-1} \circ L_{k-1}, \ldots, h_3 \circ L_3, h_2, h_1 \circ L_1)$
and there are ritty $\tilde g_2$, $g_1$ such that $(M_k \circ h_k \circ L_k, h_{k-1} \circ L_{k-1}, \ldots, h_3 \circ L_3, \tilde g_2, g_1 \circ L_1)$
is a descaling of $t_1 \star \vec f$, and $h_2 \circ h_1 = \tilde g_2 \circ g_1$. Note that this new
decomposition is still a left descaling. Since neither $h_2$ nor $h_1$ were type C, and
the class of type C ritty polynomials is closed under Ritt swaps, the new ritty
factors $\tilde g_2$ and $g_1$ are not type C. Also, this new descaling does not have any loose
translations: at all $j \neq 1$, it inherits this property from the old descaling, and at 1
it has no translations, as required.

Thus, the new decomposition satisfies the hypotheses of this proposition, so we
can apply the lemma again to show that $L_3 = \mathrm{id}$ and obtain ritty $\tilde g_3$ and $g_2$ such
that $\tilde g_3 \circ g_2 = h_3 \circ \tilde g_2$ and $(M_k \circ h_k \circ L_k, h_{k-1} \circ L_{k-1}, \ldots, h_4 \circ L_4, \tilde g_3, g_2, g_1 \circ L_1)$ is a
descaling of $t_2 t_1 \star \vec f$, again satisfying the hypotheses of the proposition.

Repeating this process $(k-1)$ times we end up with a descaling
$(M_k \circ g_k, g_{k-1}, \ldots, g_2, g_1 \circ L_1)$ of $v \star \vec f$ that satisfies the conclusions of the Proposition. □

Lemma 3.2.9. Again, let $v := (t_{k-1} t_{k-2} \cdots t_1)$ and suppose that $v \star \vec f$ is defined.
If no factor of $\vec f$ is type C, then $\vec f$ admits a left descaling with no loose translations.

Proof. We showed before that $\vec f$ admits a left descaling
$(M_k \circ h_k \circ L_k, h_{k-1} \circ L_{k-1}, \ldots, h_2 \circ L_2, h_1 \circ L_1)$. We would like to show that for each $j \ge 2$, if $h_j$ is
type J, then $L_j = \mathrm{id}$. Instead, we will show that if there is a left descaling of $\vec f$
where this first fails at $j$, then there is another left descaling of $\vec f$ where this first
fails at $j + 1$. Thus, we will show by induction that there is a left descaling where
this never fails.

To start from the beginning, suppose this fails at 1, so that $h_2$ is type J but
$L_2 \neq \mathrm{id}$. However, $t_1 \star \vec f$ is defined, so $(h_2 \circ L_2, h_1 \circ L_1)$ is swappable. Since $h_2$ is type
J, $h_1$ must be a monomial, so $h'_2 := h_2 \circ L_2$ must itself be ritty. So we have found
a new left descaling of $\vec f$, namely $(M_k \circ h_k \circ L_k, h_{k-1} \circ L_{k-1}, \ldots, h_3 \circ L_3, h'_2, h_1 \circ L_1)$
with no loose translations at 1. It is clear that we can continue inductively. □

3.4.2. Double-jumps and the proof of the crucial combinatorial lemma. We first
make a few observations about words of the form $t_i t_{i+1}$ or $t_{i+1} t_i$, called
double-jumps. Most of the time, their action is undefined.

Definition 3.2.7. A double-jump is a sequence of two Ritt swaps of the form $t_i t_{i+1}$
or $t_{i+1} t_i$.

Since only three compositional factors are involved in a double-jump, it is suffi- 
cient to characterize the case i = 1. 

If $j_l \circ l \circ r = \tilde l \circ j \circ r = \tilde l \circ \tilde r \circ j_r$ witnesses the fact that $t_1 t_2 \star (j_l, l, r) = (\tilde l, \tilde r, j_r)$,
then we say that $j_l$ double-jumps $(l, r)$ from the left, and so $(l, r)$ is double-jumpable
from the left.

Reading the same equation right-to-left, $j_l \circ l \circ r = \tilde l \circ j \circ r = \tilde l \circ \tilde r \circ j_r$ witnesses
the fact that $t_2 t_1 \star (\tilde l, \tilde r, j_r) = (j_l, l, r)$; we say that $j_r$ double-jumps $(\tilde l, \tilde r)$ from the
right, and so $(\tilde l, \tilde r)$ is double-jumpable from the right.

Remark 3.2.1. In order for a double-jump to be defined, all of the factors above 
must be swappable, i.e. linearly related to a ritty polynomial. 

Lemma 3.2.10. If in the middle of a double-jump $j$ is an odd Chebyshev polynomial,
then $(l, j, r)$ is linearly equivalent to $(A \circ C_p, C_q, C_r \circ B)$ for some primes $p$ and $r$,
not necessarily odd, and some linear $A$ and $B$. (We allow the possibility that
$p = 2$ and/or $r = 2$.)

Proof. We already know that $l$ and $r$ must be linearly related to (possibly degree 2)
Chebyshev polynomials, so we may write $(l, j, r) = (A' \circ C_p \circ A'', C_q, B'' \circ C_r \circ B')$.
Our purpose is to get rid of $A''$ and $B''$. We show how to get rid of $A''$ and assure
you that the other half of the proof is exactly the same. Since $(l, j)$ is swappable,
there are linear $L$, $M$, and $N$ such that $L^{-1} \circ A' \circ C_p \circ A'' \circ M$ and $M^{-1} \circ C_q \circ N$ are
ritty and $[L^{-1} \circ A' \circ C_p \circ A'' \circ M] \circ [M^{-1} \circ C_q \circ N]$ is one side of a basic Ritt identity.
If $p$ is also odd, the basic Ritt identity must be $C_p \circ C_q = C_q \circ C_p$. Since odd
Chebyshev polynomials are not non-trivially linearly related to themselves except
for $(-1) * C_p = C_p$, we must have $L^{-1} \circ A' = A'' \circ M = (\cdot \pm 1)$ and $M^{-1} = N = (\cdot \pm 1)$,
in particular making $A'' = (\cdot \pm 1)$, as wanted.

If p = 2, then the basic Ritt identity must be Q o C q ,\ = C q ,\ o Q for some A. 
Then M~ x = (-/i) where p, = and r 1 oi'oC 2 o A" o M = Q. 

Write A" = (+-D) o {-v) and apply to both sides to get {-^v 2 ) o oi'o 
C 2 o (+D) = Q. Therefore, D = 2, and 

A'oC 2 o A" = A 1 o C 2 o (+2) o = A 1 o (+2/? - 2) o (-^i 2 ) o C 2 

So we let A — A' o (+2/i 2 — 2) o (-/x 2 ) and obtain the desired conclusion. □ 
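The two facts about Chebyshev polynomials used in this proof can be sanity-checked on small cases. The sketch below is ours and rests on assumptions: we normalize $C_n$ by $C_0 = 2$, $C_1 = x$, $C_{n+1} = x C_n - C_{n-1}$ (so that $C_2 = x^2 - 2$), and we read $(-1) * C_p = C_p$ as saying that conjugating an odd $C_p$ by $x \mapsto -x$ gives back $C_p$. Polynomials are coefficient lists, lowest degree first.

```python
def trim(p):
    """Drop trailing zero coefficients (keep at least one entry)."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])


def mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)


def compose(p, q):
    """Return p(q(x)) by Horner's rule."""
    result = [0]
    for c in reversed(p):
        result = add(mul(result, q), [c])
    return result


def cheb(n):
    """C_n under the assumed normalization C_0 = 2, C_1 = x."""
    prev, cur = [2], [0, 1]
    for _ in range(n):
        prev, cur = cur, add(mul([0, 1], cur), [-c for c in prev])
    return prev


NEG = [0, -1]  # the linear map x -> -x

commute_3_5 = compose(cheb(3), cheb(5)) == compose(cheb(5), cheb(3))
odd_self_conjugate = compose(NEG, compose(cheb(3), NEG)) == cheb(3)
```

Under this normalization both booleans come out true, and `compose(cheb(3), cheb(5))` agrees with `cheb(15)`.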

Lemma 3.2.11. If $j$ in a double-jump is not quadratic, then there are linear $L$ and
$M$ and ritty $\hat l$, $\hat j$ and $\hat r$ such that $(L \circ \hat l, \hat j, \hat r \circ M)$ is linearly equivalent to $(l, j, r)$.

Proof. If $j = A \circ C_p \circ B$ is odd type C, we apply the previous lemma to
$(l \circ A, C_p, B \circ r)$. Since $t_2 \star (l, j, r)$ is defined, $j$ cannot be type J, as per Remark 3.1.5.
The hypothesis of the lemma explicitly rules out quadratic $j$. The only possibilities
left are that $j$ is type B, type coJ, or an odd monomial. In any case, since $(l, j)$
and $(j, r)$ are both swappable, neither $l$ nor $r$ can be type C, so Proposition 3.2.6
applies and the rest is routine. □

We now give a proof of one of the directions of Proposition 3.2.5 and note
that the same proof gives the other direction simply by reading equation (14) below
right-to-left instead of left-to-right.

Proof. Going back to the original question and naming everything, suppose that
$t_1 t_2 t_1 \star (a_0, b_0, c_0)$ is defined; we need to show that $t_2 t_1 t_2 \star (a_0, b_0, c_0)$ is also defined
and equal. In

(14) $(a_0, b_0, c_0) \xrightarrow{t_1} (a_0, c_1, b_1) \xrightarrow{t_2} (c_2, a_2, b_1) \xrightarrow{t_1} (c_2, b_3, a_3)$

$c_0$ double-jumps $(a_0, b_0)$ from the right, and $a_0$ double-jumps $(c_1, b_1)$ from the left,
with $c_1$ playing the role of $j$ in the first case, and $a_2$ in the second case.

Since $a_0 \circ c_1 = c_2 \circ a_2$ is a Ritt swap, $c_1$ and $a_2$ cannot both be quadratic. Then,
by Lemma 3.2.11 we get compatible descalings of $(a_0, c_1, b_1)$ and $(c_2, a_2, b_1)$, and
the rest is routine. □

3.5. Canonical forms.

3.5.1. Motivations and definitions. We now work on finding canonical forms for
words in $M_k$ such that every word has an equivalent (defined below) word of this
canonical form. In particular, equivalent words represent the same permutation
(see Definition 3.2.4). We obtain these canonical words $\hat w$ from the original word $w$
by a sequence of syntactic operations on substrings of $w$. We just proved Lemma
3.2.6, which shows that we may

(1) replace $t_i t_i$ by the empty string;

(2) replace $t_i t_j$ by $t_j t_i$ for non-consecutive $i$ and $j$;

(3) replace $t_{i+1} t_i t_{i+1}$ by $t_i t_{i+1} t_i$, or vice versa.

These operations allow us to obtain equivalent words of two canonical forms, one
roughly corresponding to an insert-sort, the other to a merge-sort, if one thinks of
permuting factors as putting them in a particular order. Surely these combinatorial
computations have been worked out by computer scientists, but we failed to find
a reference in the literature. To the best of our understanding, our results on
canonical forms do not follow easily from [13]. Although our main result of this
section, Theorem 3.3, can be immediately deduced from their work, we need these
canonical forms in the next Section 3.6.1, whose results cannot be obtained easily
from their work.

Remark 3.2.2. If a word $v$ is obtained from a word $w$ via finitely many applications
of rules (1), (2) and (3) above, then they represent the same permutation, the
length of $v$ is less than or equal to the length of $w$, and for any decomposition $\vec f$, if
$w \star \vec f$ is defined, then $v \star \vec f = w \star \vec f$. It may be that $v \star \vec f$ is defined while $w \star \vec f$ is not.

Definition 3.2.8. If two words $v$ and $w$ in $M_k$ can be obtained from each other
by operations (2) and (3) above, we write $w \approx v$.

Definition 3.2.9. A word $w$ is length-minimal if no strictly shorter word $v$ may
be obtained from $w$ via finitely many applications of rules (1), (2) and (3) above.

To shorten our proofs, we will assume that we start with a length-minimal word,
and reach a contradiction every time we get a chance to cancel $t_i t_i$. Equivalently,
we could induct on the length of the word and cite the inductive hypothesis every
time we get a chance to cancel $t_i t_i$, obtaining a shorter equivalent word. We do not,
because there are already far too many inductions in these proofs.

3.5.2. The first canonical form. A sequence of Ritt swaps in the first canonical form
acts by moving several factors $f_{b_i}$ in the decomposition some number of steps to
the left; the first factor it moves, $f_{b_1}$, begins to the left of the next factor moved,
$f_{b_2}$, which begins left of $f_{b_3}$, etc. In computer science, this is called an insert-sort:
having arranged $f_k$ through $f_{i+1}$ in the right order, this sequence inserts $f_i$ in the
required place among $f_k$ through $f_{i+1}$ and then proceeds to $f_{i-1}$, and so on, until
all factors are ordered the right way.

Proposition 3.2.7. For every $w \in M_k$ there exists a unique $\hat w \in M_k$ which
represents the same permutation as $w$ and has the form

$\hat w = (t_{a_k} t_{a_k - 1} \cdots t_{b_k})(t_{a_{k-1}} t_{a_{k-1} - 1} \cdots t_{b_{k-1}}) \cdots (t_{a_1} t_{a_1 - 1} \cdots t_{b_1})$

with $a_i \ge b_i$ for all $i$, and $b_k < b_{k-1} < \cdots < b_1$.

This $\hat w$ is obtained from $w$ by operations (1), (2), and (3) above, so for any
decomposition $\vec f$ such that $w \star \vec f$ is defined, $\hat w \star \vec f = w \star \vec f$.
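At the level of permutations, the uniqueness claim can be tested by brute force for small $k$. The following sketch is ours and is only a consistency check, not a proof: it enumerates every word of the shape displayed above (chunks of descending consecutive indices with $a_i \ge b_i$ and the lowest indices $b_i$ strictly increasing from the leftmost chunk to the rightmost), and counts the permutations they represent. The helper names are hypothetical, and words are encoded as lists of indices, leftmost letter first.

```python
from itertools import combinations, product


def word_to_perm(word, k):
    """Permutation (as a tuple of images of 1..k) represented by t_{a_r}...t_{a_1}."""
    result = list(range(1, k + 1))
    # Multiply on the right by (a a+1): swap the images of a and a+1.
    for a in word:
        result[a - 1], result[a] = result[a], result[a - 1]
    return tuple(result)


def canonical_words(k):
    """Yield every word of the first canonical shape in the letters t_1..t_{k-1}."""
    for n in range(k):
        # bs is increasing, so the leftmost chunk gets the smallest lowest index.
        for bs in combinations(range(1, k), n):
            for tops in product(*[range(b, k) for b in bs]):
                word = []
                for a, b in zip(tops, bs):
                    word.extend(range(a, b - 1, -1))  # chunk t_a t_{a-1} ... t_b
                yield word


def summary(k):
    """Return (number of canonical words, number of distinct permutations, longest word)."""
    words = list(canonical_words(k))
    perms = {word_to_perm(w, k) for w in words}
    return len(words), len(perms), max(len(w) for w in words)
```

For $k = 4$ this produces 24 words representing 24 distinct permutations, with longest word of length 6, which is what uniqueness of the canonical word predicts.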

Proof. We begin by replacing $w$ by some $w'$ that has the shortest length among
words that can be obtained from $w$ by operations (1), (2), and (3) above. This
means that as we continue to perform these operations on $w'$, we should never be
able to perform operation (1) as that would shorten the word.

Without the requirement that $b_k < b_{k-1} < \ldots < b_1$, the proposition follows
trivially, by cutting the word $w'$ into maximal consecutive-decreasing-index substrings.
So it is only when $b_{i+1} \ge b_i$ that we need to do anything. Note that any substring
of $w'$ is also length-minimal:

Lemma 3.2.12. If $w' = tuv$ is length-minimal, then $u$ is length-minimal.

Note that if we can fix one pair of out-of-order $b_i$'s, we can fix everything in
finitely many steps; so it suffices to prove the proposition for

$w' = (t_a t_{a-1} \cdots t_b)(t_c t_{c-1} \ldots t_d)$

If $b < d$, then $w'$ is already of the desired form. So assume $b \ge d$.
Now compare $b$ and $c$:
Now compare 6 and c: 

• If $b > c + 1$, then $\hat w = (t_c t_{c-1} \cdots t_d)(t_a t_{a-1} \ldots t_b)$ works, because in this
case each $t_i$ in the first chunk of $w'$ commutes with each $t_j$ in the second
chunk, and $b > c + 1 > d$.

• If $b = c + 1$, $w'$ is already of the desired form, a single
consecutive-descending-index string.

• If $b = c$, operation (1) shortens the word $w'$, contradicting length-minimality.

• This leaves the case where $c > b \ge d$, which needs a lemma:

Lemma 3.2.13. (1) If $r \ge s$, then $t_r (t_{r+1} t_r t_{r-1} \ldots t_s) \approx (t_{r+1} t_r t_{r-1} \ldots t_s) t_{r+1}$.

(2) If $p > r \ge s$, then $t_r (t_p t_{p-1} \ldots t_s) \approx (t_p t_{p-1} \ldots t_s) t_{r+1}$.

Proof. For (1), $t_r t_{r+1} t_r \approx t_{r+1} t_r t_{r+1}$, and then $t_{r+1}$ commutes with $t_{r-1}$ through
$t_s$. For (2), note that $t_r$ commutes with $t_p$ through $t_{r+2}$ and then (1) applies. □
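A necessary consequence of this lemma is that the two sides represent the same permutation, and that much is easy to test. The sketch below is ours; it reads the hypotheses as $p > r \ge s$ (the inequality signs are an assumption on our part), uses a hypothetical helper `word_to_perm`, and checks every admissible triple for a given $k$.

```python
def word_to_perm(word, k):
    """Permutation (tuple of images of 1..k) represented by the word t_{a_r}...t_{a_1}."""
    result = list(range(1, k + 1))
    for a in word:  # right-multiply by the adjacent transposition (a a+1)
        result[a - 1], result[a] = result[a], result[a - 1]
    return tuple(result)


def lemma_holds(k):
    """Check t_r (t_p ... t_s) and (t_p ... t_s) t_{r+1} agree as permutations for p > r >= s."""
    for s in range(1, k):
        for r in range(s, k):
            for p in range(r + 1, k):
                chunk = list(range(p, s - 1, -1))  # t_p t_{p-1} ... t_s
                if word_to_perm([r] + chunk, k) != word_to_perm(chunk + [r + 1], k):
                    return False
    return True
```

Equality of permutations is weaker than the relation $\approx$ of Definition 3.2.8, so this is a plausibility check only.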

Returning to the special case of the proposition, we compare $a$ and $c$:

• If $a < c$, then the lemma can be applied to each $t_i$, for $a \ge i \ge b$, giving
$w' \approx (t_c t_{c-1} \ldots t_d) t_{a+1} t_a \cdots t_{b+1} =: \hat w$, of the desired form because $d \le b$
implies $d < b + 1$.

• If $a \ge c$, the lemma can still be applied to each $t_i$ for $c - 1 \ge i \ge b$, giving

$w' = (t_a t_{a-1} \cdots t_b)(t_c t_{c-1} \ldots t_d) =$
$= t_a \ldots t_c (t_{c-1} \ldots t_b\, t_c t_{c-1} \ldots t_d) \approx$
$\approx t_a \cdots t_c (t_c t_{c-1} \ldots t_d\, t_c \ldots t_{b+1})$
contradicting length-minimality.

This finishes the proof of the special case of the proposition, and the whole
proposition follows. □

Corollary 3.2.3. If two words $w$ and $w'$ represent the same permutation and both
$w \star \vec f$ and $w' \star \vec f$ are defined, then $w \star \vec f = w' \star \vec f$.

Proof. For every permutation there is a unique word in the first canonical form
representing it. □

The first canonical form is used a lot, in particular to obtain words in the second
canonical form in the next Section 3.5.3. One immediate consequence is a bound
on the length of words and the number of decompositions of a given polynomial.

Corollary 3.2.4. For any given polynomial $P$ and decomposition $(f_k, \ldots, f_1)$ of
$P$, there are at most $k!$ other decompositions $\vec g$ of $P$ (up to linear equivalence, of
course), and any one of them can be reached by a sequence of at most $\frac{k(k-1)}{2}$ Ritt
swaps.
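For orientation, the classical count behind a bound of this kind is that a permutation of $k$ items can be sorted by adjacent transpositions in as many steps as it has inversions, and no permutation has more than $k(k-1)/2$ inversions. The sketch below is ours and illustrates only this permutation-level fact, under the assumption that this is the bound intended in the corollary; the function name is hypothetical.

```python
from itertools import permutations


def adjacent_swaps_needed(perm):
    """Number of inversions, i.e. the least number of adjacent swaps that sort perm."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])


def worst_case(k):
    """Largest number of adjacent swaps needed over all permutations of k items."""
    return max(adjacent_swaps_needed(p) for p in permutations(range(k)))
```

For example, `worst_case(4)` is 6, attained by the order-reversing permutation.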

3.5.3. Second canonical form. The second canonical form is for refactoring
decompositions of polynomials that come pre-factored into chunks: suppose
$P = P_t \circ \ldots \circ P_1$ and we have a decomposition $(f_{i,r_i}, \ldots, f_{i,1})$ for each $P_i$. Then we want
to first do as much shuffling as possible within the decompositions of $P_i$, and only
then move factors between them. This corresponds to a merge-sort, where first the
factors in each chunk are put into the order in which they will appear in the final
decomposition, and then the chunks are merged. More precisely,

Proposition 3.2.8. Given a polynomial $P = F_t \circ \ldots \circ F_1$, decompositions $(f_{i,r_i}, \ldots, f_{i,1})$
for each $F_i$, let $\vec f := (f_{t,r_t}, \ldots, f_{t,1}, f_{t-1,r_{t-1}}, \ldots, f_{1,1})$ be the decomposition of $P$
obtained by concatenating the decompositions of the $F_i$.

For every word $w$ such that $w \star \vec f$ is defined, there is another word $\hat w = v w_1 w_2 \cdots w_t$
such that:

• $\hat w \star \vec f = w \star \vec f$

• each $w_i$ only permutes factors $f_{i,j}$, so $w_1 w_2 \cdots w_t \star \vec f$ is still a concatenation
of decompositions of $F_i$.

• each of $w_i$ and $v$ is in the first canonical form.

• $v$ never switches factors $f_{i,j}$ and $f_{i,j'}$ originating in the decomposition of
the same $F_i$.

The last item in the conclusion of the proposition is a bit vague, to be made
more precise in Lemma 3.2.15 below. The rest of this section constitutes the proof
of this proposition.

Remark 3.2.3. Although the statement of the proposition seems to be about a
decomposition, this is in fact a purely combinatorial result that only depends on
the numbers $r_i$ and not on the particular polynomials. The new word will be
obtained from the old word by operations (1), (2), and (3) defined at the beginning
of this section, so the same new word will work for any decomposition. Given a
tuple of positive integers $r_i$ and a permutation of $k = \sum r_i$ elements, there is
a unique word in $M_k$ representing that permutation and satisfying the last three
requirements of the proposition, so $\hat w$ is unique and does not depend on $\vec f$.

Lemma 3.2.14. It is enough to prove Proposition 3.2.8 for $t = 2$.

Proof. We induct on $t$ with $t = 2$ as base case. To prove the case $t = s + 1$ from
$t = s$, first write

$P = G_2 \circ G_1$ where $G_2 = F_{s+1} \circ F_s \circ \ldots \circ F_2$ and $G_1 = F_1$

We first apply the case $t = 2$ to $w$ and $P = G_2 \circ G_1$, getting $w \approx v_0 u_1 u_2$.
We then apply the case $t = s$ to $u_2$ and $G_2 = F_{s+1} \circ F_s \circ \ldots \circ F_2$, getting
$u_2 \approx v' w'_2 w'_3 \ldots w'_{s+1}$. So $w \approx v_0 u_1 v' w'_2 w'_3 \ldots w'_{s+1} \approx v_0 v' u_1 w'_2 w'_3 \ldots w'_{s+1}$, the second
equivalence because $u_1$ and $v'$ act on disjoint sets of factors. Finally, we notice that
letting $v = v_0 v'$, $w_1 = u_1$, and $w_i = w'_i$ for $i \ge 2$ is of the desired form. □

Let us restate Proposition 3.2.8 for $t = 2$ with fewer subscripts and more
precision:

Lemma 3.2.15. (Proposition 3.2.8 for $t = 2$) Suppose that we have a polynomial
$F = H \circ G$, decompositions $(g_k, \ldots, g_1)$ of $G$ and $(h_r, \ldots, h_1)$ of $H$, and a word
$w \in M_{k+r}$. Let $\vec f := (h_r, \ldots, h_1, g_k, \ldots, g_1)$, so $\vec f$ is a decomposition of $F$.
Then there is another word $\hat w = v w_G w_H$ such that:

• $\hat w \star \vec f = w \star \vec f$

• $w_G$ is a word in $t_1$ through $t_{k-1}$, so it only permutes factors of $G$;

• $w_H$ is a word in $t_{k+1}$ through $t_{r+k-1}$, so it only permutes factors of $H$;

• all three pieces $v$, $w_G$, and $w_H$ are in the first canonical form, so in
particular

$v = (t_{a_m} t_{a_m - 1} \cdots t_{b_m})(t_{a_{m-1}} t_{a_{m-1} - 1} \cdots t_{b_{m-1}}) \cdots (t_{a_1} t_{a_1 - 1} \cdots t_{b_1})$

• $v$ never switches factors that both originate in $H$ or both originate in $G$:
$b_1 = k$, $b_2 = k - 1$, \ldots, $b_m = k - m + 1$ and $a_1 > a_2 > \ldots > a_m$.

Proof. We may assume without loss of generality that $w$ is already in the first
canonical form:

$w = (t_{c_n} t_{c_n - 1} \cdots t_{d_n})(t_{c_{n-1}} t_{c_{n-1} - 1} \cdots t_{d_{n-1}}) \cdots (t_{c_1} t_{c_1 - 1} \cdots t_{d_1})$

with $c_i \ge d_i$ for all $i$, and $d_n < d_{n-1} < \ldots < d_1$.

Now $w_H$ is going to be the largest right substring of $w$ that doesn't touch the
factors of $G$. More precisely, let $j$ be the greatest index for which $d_j > k$, and let

$w_H := (t_{c_j} t_{c_j - 1} \ldots t_{d_j}) \cdots (t_{c_1} t_{c_1 - 1} \ldots t_{d_1})$

Then

$w = (t_{c_n} t_{c_n - 1} \ldots t_{d_n})(t_{c_{n-1}} t_{c_{n-1} - 1} \cdots t_{d_{n-1}}) \cdots (t_{c_{j+1}} t_{c_{j+1} - 1} \cdots t_{d_{j+1}}) w_H =: w' w_H$

Rewriting $w'$ (what's left of $w$) as $v w_G$ requires actual reordering for two reasons
corresponding to the two new requirements: it is possible that some $d$ is too small:
$d_{j+1} < k$ or $d_{i+1} < d_i - 1$ for some $i$; or that $c_{i+1} \ge c_i$. As we did in the proof of
the first canonical form, we start unwrapping $w'$ from the right, maintaining the
following inductive hypotheses:

• $w' \approx v_{bad} v_{good} u_G$,

• $u_G$ only permutes the factors of $G$,

• $v_{good} = (t_{a_l} t_{a_l - 1} \cdots t_{k+1-l}) \cdots (t_{a_2} t_{a_2 - 1} \cdots t_{k-1})(t_{a_1} t_{a_1 - 1} \cdots t_k)$ satisfies the
requirements for $v$, i.e. has $a_1 > a_2 > \ldots > a_l$,

• $v_{bad}$ is in the first canonical form in chunks $(t_{c_i} \ldots t_{d_i})$ with all $d_i < k + 1 - l$
($l$ is from the previous item on this list).

We initiate the induction by collecting as much as possible in $u_G$, i.e. setting
$u_G$ to be the maximal right substring of $w'$ which only uses $t_1$ through $t_{k-1}$. Let
$w''$ be what's left, i.e. such that $w' = w'' u_G$. Since the indices in the $i$th chunk
$(t_{c_i} t_{c_i - 1} \ldots t_{d_i})$ of $w'$ begin with $d_i \le k$ and increase rightwards, the first from the
right index $\ge k$ is $k$. So the rightmost chunk of $w''$ indeed has $k$ as its lowest index,
so we can set $v_{good}$ to be that chunk. We set $v_{bad}$ to be what's left, still in the first
canonical form, with every chunk beginning with $d_i < k$.

The induction step will shorten $v_{bad}$ by one chunk. So we need to find a word
equivalent to

$q := (t_c \cdots t_d)(t_{a_l} t_{a_l - 1} \cdots t_{k+1-l}) \cdots (t_{a_2} t_{a_2 - 1} \cdots t_{k-1})(t_{a_1} t_{a_1 - 1} \cdots t_k)$

that looks like $\tilde v_{good} \tilde u_G$.

We first deal with the possibility that $d$ is too low, namely that $d < k - l$.
If $c < k - l$ also, then the whole left chunk $(t_c \ldots t_d)$ commutes with $v_{good}$, so
if we let $\tilde v_{good} := v_{good}$ and $\tilde u_G := (t_c \ldots t_d)$, we're done. Otherwise,
$(t_c \ldots t_d) = (t_c \ldots t_{k-l})(t_{k-l-1} \ldots t_d)$, the right half of which commutes with $v_{good}$, so now, with
$\tilde u_G := (t_{k-l-1} \cdots t_d)$,

$q \approx (t_c \cdots t_{k-l})(t_{a_l} t_{a_l - 1} \cdots t_{k+1-l}) \cdots (t_{a_2} t_{a_2 - 1} \cdots t_{k-1})(t_{a_1} t_{a_1 - 1} \cdots t_k) \tilde u_G$

Now the worry is that $c > a_l$. Since $a_l \ge k + 1 - l > k - l$,

$t_c \cdots t_{k-l} = (t_c \cdots t_{a_l + 1})(t_{a_l} \cdots t_{k-l})$

Computations just like in the proof of the first canonical form give

$(t_{a_l} \ldots t_{k-l})(t_{a_l} t_{a_l - 1} \cdots t_{k+1-l}) \approx (t_{a_l - 1} \cdots t_{k-l})(t_{a_l} \cdots t_{k+1-l} t_{k-l})$

Now the rightmost $t_{k-l}$ commutes with the rest of $v_{good}$, and $(t_c \ldots t_{a_l + 1})$ commutes
with $(t_{a_l - 1} \cdots t_{k-l})$, so we get that

$q \approx (t_{a_l - 1} \cdots t_{k-l})(t_c \cdots t_{a_l + 1})(t_{a_l} \cdots t_{k+1-l})(t_{a_{l-1}} t_{a_{l-1} - 1} \cdots t_{k+1-(l-1)}) \cdots$
$\cdots (t_{a_2} t_{a_2 - 1} \cdots t_{k-1})(t_{a_1} t_{a_1 - 1} \cdots t_k) t_{k-l}$

The rightmost $t_{k-l}$ can be added into $u_G$. As long as $c > a_i$, we repeat this
procedure, moving $(t_c \ldots t_{a_i + 1})$ past the $i$th chunk of $v_{good}$, until he's finally in his
rightful place.

We are now done with the induction step, and hence with the proof. 

□ 



Combining the two lemmas proves Proposition 3.2.8.

3.6. Using second canonical form to show that almost everything comes
from skew-twists. Finally, we put our technical tools to use. Though you may
have forgotten over the past twenty pages (assuming we told you in the first place),
our quest is to characterize $(f, g)$-skew-invariant curves for trivial polynomials $f$ and
$g$, i.e. non-linear $f$ and $g$ that are not skew-conjugate to monomials or Chebyshevs.
Remember (if we ever told you), for linear $L$, two polynomials $f$ and $g := L^{\sigma} \circ f \circ L^{-1}$
are called skew-conjugate. Most of these skew-invariant curves come from
skew-twists, described in the next section. Here, we mop up the few curves that do
not. Actually, we will say nothing about invariant curves until much later; for now,
we characterize certain commutative diagrams of polynomials with coefficients in
a difference field. More precisely, we characterize triples of polynomials $(f, g, \pi)$
satisfying $\pi^{\sigma} \circ f = g \circ \pi$, where $f$ and $\pi$ share no initial compositional factors, and
$\pi^{\sigma}$ and $g$ share no terminal compositional factors.

The main result of this section, Theorem 3.3, follows immediately from Lemma
2.8 of [13]. However, the new tool we develop to prove this theorem, chebyclumps,
will be used extensively in the next section, and is not present in [13].

Theorem 3.3. If two trivial polynomials $f$ and $g$ satisfy $g \circ \pi = \pi^{\sigma} \circ f$, and $f$ and $\pi$
share no initial compositional factors, and $\pi^{\sigma}$ and $g$ share no terminal compositional
factors, then there are linear $L$ and $M$ such that $M \circ \pi \circ L$ is a monomial and both
$L^{\sigma} \circ f \circ L^{-1}$ and $(M^{\sigma})^{-1} \circ g \circ M$ have decompositions where all factors are ritty
and the degree of $\pi$ divides the out-degree of all factors of some decomposition of $f$
and the in-degree of all factors of some decomposition of $g$.

One can remove much of our terminology from this theorem, in particular
removing all references to decompositions. Then the theorem would read
"If two polynomials $f$ and $g$ satisfy $g \circ \pi = \pi^{\sigma} \circ f$, and $f$ and $\pi$ share no
initial compositional factors, and $\pi^{\sigma}$ and $g$ share no terminal compositional factors,
then there are linear $L$ and $M$ such that either $L^{\sigma} \circ f \circ L^{-1}$ and $(M^{\sigma})^{-1} \circ g \circ M$
are both monomials or Chebyshev polynomials (and then we say nothing about
$\pi$); or $M \circ \pi \circ L(x) = x^n$ is a monomial, $L^{\sigma} \circ f \circ L^{-1}(x) = x^k \cdot u(x^n)$, and
$(M^{\sigma})^{-1} \circ g \circ M(x) = x^k \cdot u(x)^n$ for some polynomial $u$."
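The second alternative in this restatement is easy to verify in the simplest setting. The sketch below is ours and makes simplifying assumptions: $\sigma$ is the identity, $L$ and $M$ are the identity, and $n$, $k$ and $u$ are arbitrary illustrative choices. With $\pi(x) = x^n$, $f(x) = x^k u(x^n)$ and $g(x) = x^k u(x)^n$ it checks that $g \circ \pi = \pi \circ f$. Polynomials are integer coefficient lists, lowest degree first.

```python
def trim(p):
    """Drop trailing zero coefficients (keep at least one entry)."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])


def mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)


def power(p, n):
    out = [1]
    for _ in range(n):
        out = mul(out, p)
    return out


def compose(p, q):
    """Return p(q(x)) by Horner's rule."""
    result = [0]
    for c in reversed(p):
        result = add(mul(result, q), [c])
    return result


n, k = 3, 2            # illustrative exponents (our choice)
u = [1, 1, 2]          # u(x) = 1 + x + 2x^2 (our choice)
x = [0, 1]

pi = power(x, n)                           # pi(x) = x^n
f = mul(power(x, k), compose(u, pi))       # f(x) = x^k * u(x^n)
g = mul(power(x, k), power(u, n))          # g(x) = x^k * u(x)^n

lhs = compose(g, pi)   # g(pi(x))
rhs = compose(pi, f)   # pi(f(x))
```

Both sides equal $x^{nk} u(x^n)^n$, so `lhs == rhs`; the general statement with nontrivial $\sigma$, $L$ and $M$ is not exercised here.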

We break down the proof into three propositions: a translation into the language 
of decompositions and canonical forms, a proof of the theorem in case none of these 
polynomials have any type C factors, and a proof of the theorem in case one of the 
polynomials does have a type C factor. 

Proposition 3.3.1. (Translating the theorem)

Suppose that polynomials $f$, $g$, and $\pi$ satisfy $g \circ \pi = \pi^{\sigma} \circ f$, and that $f$ and $\pi$ share no
initial compositional factors, and $\pi^{\sigma}$ and $g$ share no terminal compositional factors.
Let $m$ be the number of factors in (any) decomposition of $\pi$, and let $l$ be the number
of factors in (any) decomposition of $f$ (or $g$). Then there are decompositions $\vec\pi$ of
$\pi$, $\vec f$ of $f$, $\vec g$ of $g$, and $\vec\rho$ of $\pi^{\sigma}$ (which $\vec\rho$ need not be $(\vec\pi)^{\sigma}$) such that

$(t_l \ldots t_1) \ldots (t_{l+m-2} \cdots t_{m-1})(t_{l+m-1} \ldots t_m) \star \vec g \vec\pi = \vec\rho \vec f$
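The combinatorial content of this word is that it carries each factor of $\pi$ leftwards past every factor of $g$ while keeping the factors of $\pi$, and those of $g$, in their original relative order. The sketch below is ours and tracks labels only: it ignores whether each Ritt swap is actually defined and how the polynomials change under a swap. A decomposition is a list written left to right, so position $i$ counted from the right is the $i$-th factor, and $t_i$ exchanges positions $i$ and $i+1$; the function names are hypothetical.

```python
def apply_word(word, labels):
    """Apply t_{a_r}...t_{a_1} (given as [a_r, ..., a_1]) to a list of labels.

    The rightmost letter acts first; t_i swaps positions i and i+1 counted from the right.
    """
    out = list(labels)
    n = len(out)
    for i in reversed(word):
        out[n - i], out[n - i - 1] = out[n - i - 1], out[n - i]
    return out


def crossing_word(l, m):
    """The word (t_l ... t_1) ... (t_{l+m-1} ... t_m), leftmost letter first."""
    word = []
    for i in range(1, m + 1):
        word.extend(range(l + i - 1, i - 1, -1))  # chunk t_{l+i-1} ... t_i
    return word


l, m = 3, 2
start = ["g3", "g2", "g1", "pi2", "pi1"]      # labels for the factors of g followed by pi
end = apply_word(crossing_word(l, m), start)  # ['pi2', 'pi1', 'g3', 'g2', 'g1']
```

In the proposition the factors on the left of the result are those of $\vec\rho$ and the ones on the right those of $\vec f$; the labels here only record where each original factor has moved.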

Proof. Let $(\pi_m, \ldots, \pi_1)$ be a decomposition of $\pi$, and $(g_l, \ldots, g_1)$ be a decomposition
of $g$. Let $w = v w_1 w_2$ be the word in the second canonical form that yields a
decomposition of $f$ followed by a decomposition of $\pi^{\sigma}$. Since we were free to choose
the decompositions of $\pi$ and $g$, we may assume, losing this freedom, that $w_i$ are
empty. So we get decompositions as above and

$v = (t_{a_k} t_{a_k - 1} \cdots t_{b_k})(t_{a_{k-1}} t_{a_{k-1} - 1} \cdots t_{b_{k-1}}) \cdots (t_{a_1} t_{a_1 - 1} \cdots t_{b_1})$

with $a_i \ge b_i - 1$ for all $i$ ($a_i = b_i - 1$ means that the word $(t_{a_i}, \ldots, t_{b_i})$ is empty);
$b_i = \mathrm{length}(\vec\pi) + 1 - i$, and $a_k < \cdots < a_2 < a_1$; and

$v \star \vec g \vec\pi = \vec\rho \vec f$

Now it follows immediately that $k = \mathrm{length}(\vec\pi)$, for otherwise $t_1$ does not occur in
$v$, so the rightmost factor $\pi_1$ in $\vec g \vec\pi$ is untouched by the action of $v$, so it is a shared
initial factor of $\pi$ and $f$, contradicting a hypothesis of the proposition.

For exactly the same reasons, unless $a_i = \mathrm{length}(\vec g) + \mathrm{length}(\vec\pi) - i$ for all $i$, $\vec\rho$
and $\vec g$ will share a terminal factor, which is also not supposed to happen.

So $v = (t_l \ldots t_1) \ldots (t_{l+m-2} \cdots t_{m-1})(t_{l+m-1} \cdots t_m)$ as wanted. □

We have $v := v_1 v_2 \cdots v_m$ where $v_i := t_{l+i-1} t_{l+i-2} \cdots t_i$, and we have $v \star \vec g \vec\pi$
defined. In the next proposition, where none of the $g_i$ or $\pi_i$ are type C, we will
make repeated use of Proposition 3.2.6. We will then introduce chebyclumps and
make a series of observations about them in order to handle the other case.

Proposition 3.3.2. (The descaling case)
Theorem 3.3 holds if $g$ and $\pi$ have no type C factors.

Proof. We may without loss of generality replace $\vec g$ and $\vec\pi$ by linearly equivalent
decompositions that admit descalings with only $g_l$ and $\pi_1$ non-monic, so that the
concatenation of the two descalings is a descaling of $\vec g \vec\pi$. Since none of the $g_i$ or $\pi_i$
are type C, Lemma 3.2.9 and Proposition 3.2.6 apply to the action of each $v_i$, and
an easy induction on $i$ shows that there are linear $L$ and $M$ and ritty $h_i$, $o_i$ such
that $g_l = M \circ h_l$, $g_i = h_i$ for all other $i$, $\pi_1 = o_1 \circ L$, and $\pi_i = o_i$ for all other $i$.
So

$g \circ \pi = M \circ h_l \circ h_{l-1} \circ \ldots \circ h_1 \circ o_k \circ \ldots \circ o_2 \circ o_1 \circ L$

and

$\pi^{\sigma} \circ f = M \circ \tilde o_k \circ \ldots \circ \tilde o_2 \circ \tilde o_1 \circ \tilde h_l \circ \tilde h_{l-1} \circ \ldots \circ \tilde h_1 \circ L$

where the ritty polynomials with tildes are the result of the Ritt swaps where the
linear factors witnessing the Ritt swaps are all identity.

By assumption, none of the factors are type C. None of the $o_i$ are type J, since
those cannot swap in the necessary direction. So all non-monomial $o_i$ have
well-defined in- and out-degrees; for each $i$, the in-degree of $\tilde o_i$ is higher than the in-degree
of the corresponding $o_i$ (by a factor of $\deg g$), contradicting the fact that $\tilde o_i$ are the
factors of $\pi^{\sigma}$ and $o_i$ are the factors of $\pi$. Therefore, $\tilde o_i = o_i$ are monomials for all
$i$. For some linear $N$, $N \circ o_k \circ \ldots \circ o_2 \circ o_1 \circ L$ is a decomposition of $\pi$, so $L_1 := L^{-1}$
and $M_1 := N^{-1}$ are the unique translations such that $M_1 \circ \pi \circ L_1$ is a monomial.
They witness the conclusion of the theorem. □

To treat the last possibility, that at least one factor of $g$ or $\pi$ is type C, we
must turn aside and contemplate how type C factors play with each other. These
observations will be used again heavily in the next section.

3.6.1. Chebyclumps. We show that clumps of compatibly-scaled type C factors
persist (up to invading quadratic factors) under Ritt swaps, and that Ritt swaps
involving two odd Chebyshevs can only occur within these clumps.

Definition 3.3.1. Let $\vec f$ be a decomposition of a polynomial $f$.

If $L \circ f_j \circ \ldots \circ f_i \circ M = C_n$ for some integer $n$ that is not a power of 2, and some
linear $L$ and $M$, we call $(f_j, \ldots, f_i)$ a chebyclump of the decomposition.

If in addition neither $(f_{j+1}, f_j, \ldots, f_i)$ nor $(f_j, \ldots, f_i, f_{i-1})$ is a chebyclump, we call
$(f_j, \ldots, f_i)$ a maximal chebyclump of the decomposition.

A chebyclump is called odd if $n$ is odd.

Note that this definition is invariant under linear equivalence. Let us see how 
this notion interacts with Ritt swaps. 

Observation 3.3.1. (1) If one of $f_1$ and $f_2$ is type C, then $(f_2, f_1)$ is swappable if and only if the pair is a chebyclump.

(2) Every decomposition of $C_n$ is linearly equivalent to a decomposition all of whose factors are $C_p$ for prime $p$, including $p = 2$. (The property "all factors are $C_p$" is invariant under Ritt swaps.)

(3) If $(f_j, \dots, f_i)$ is a chebyclump so that $L \circ f_j \circ \dots \circ f_i \circ M = C_n$, let $g_l := C_{\deg(f_l)}$; then $(L \circ f_j, f_{j-1}, \dots, f_{i+1}, f_i \circ M)$ is linearly equivalent to $\vec g$. (Immediate from the previous item.)

(4) The converse to the previous item is obvious.

(5) For every type C indecomposable $c$, there are unique up to $\pm 1$ linear $T$ and $S$ such that $T \circ c \circ S = C_p$. For $C_2$, there is at most one such $S$ given such a $T$, and vice versa: $(+\,2\lambda^{-2} - 2) \circ (\cdot\,\lambda^{-2}) \circ C_2 \circ (\cdot\,\lambda) = C_2$ for each non-zero $\lambda$.

(6) $L$ and $M$ in the definition of chebyclump are unique up to multiplication by $\pm 1$. (Combine (2) and (4) and the fact that at least one factor in a chebyclump has odd degree.)
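The Chebyshev facts behind items (2) and (5) can be checked mechanically. The sketch below is only an illustration: it assumes the normalisation $C_n(x + 1/x) = x^n + x^{-n}$ (so that $C_2 = x^2 - 2$, consistent with the translations $(\pm 2)$ used later in this section), and the helper names `trim`, `add`, `mul`, `compose` and `cheb` are hypothetical, not notation from the text.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (coefficients are stored lowest degree first)."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def compose(p, q):
    """Return the composition p o q, computed by Horner's scheme."""
    out = [0]
    for c in reversed(p):
        out = add(mul(out, q), [c])
    return out

def cheb(n):
    """C_n under the assumed normalisation: C_0 = 2, C_1 = x, C_{n+1} = x*C_n - C_{n-1}."""
    prev, cur = [2], [0, 1]
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, add(mul([0, 1], cur), [-c for c in prev])
    return cur

# Chebyshev polynomials compose multiplicatively, in either order.
commute_ok = compose(cheb(3), cheb(5)) == cheb(15) == compose(cheb(5), cheb(3))

# Item (2): C_12 splits into prime-degree Chebyshev factors, including C_2.
prime_factors_ok = compose(cheb(2), compose(cheb(2), cheb(3))) == cheb(12)

# Item (5): a one-parameter family of linear pairs (T, S) with T o C_2 o S = C_2.
lam = Fraction(3)                  # any non-zero value works
translate = [2 / lam**2 - 2, 1]    # x + (2*lam^-2 - 2)
scale_out = [0, 1 / lam**2]        # lam^-2 * x
scale_in = [0, lam]                # lam * x
family_ok = compose(translate,
                    compose(scale_out, compose(cheb(2), scale_in))) == cheb(2)
```

Exact rational arithmetic is used so that the equalities are genuine polynomial identities rather than floating-point coincidences.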

Lemma 3.3.1. If $(f_j, \dots, f_{i+1})$ and $(f_i, \dots, f_k)$ are both chebyclumps, witnessed by $L$, $M$ for the first and by $\tilde L$, $\tilde M$ for the second, then the concatenation $(f_j, \dots, f_k)$ is a chebyclump if and only if $\tilde L \circ M = (\cdot \pm 1)$.

Proof. The backward direction is immediate; we prove the forward direction. In other words, we have polynomials $f$ and $g$ and linear $A$, $B$, $C$, $D$, $L$, and $M$, such that $B^{-1} \circ f \circ A^{-1} = C_m$ and $D^{-1} \circ g \circ C^{-1} = C_n$ and $L \circ g \circ f \circ M = C_{mn}$. So $f = B \circ C_m \circ A$ and $g = D \circ C_n \circ C$.

Since chebyclumps must have at least one factor of odd degree, and factors can move freely within the chebyclump, we choose decompositions $(B \circ C_p, f_{k-1}, \dots, f_1)$ of $f$ and $(g_l, \dots, g_2, C_q \circ C)$ of $g$.

Then $(g_l, \dots, g_2, C_q \circ C, B \circ C_p, f_{k-1}, \dots, f_1)$ is a decomposition of $g \circ f = L^{-1} \circ C_{mn} \circ M^{-1}$. According to observation 3 above, it must be linearly equivalent to the decomposition $(L^{-1} \circ C_{\deg(g_l)}, \dots, C_{\deg(g_2)}, C_q, C_p, C_{\deg(f_{k-1})}, \dots, C_{\deg(f_1)} \circ M^{-1})$. But type C ritty polynomials are not non-trivially linearly related to themselves except for $(-1) * C_p = C_p$, so $C \circ B = (\cdot \pm 1)$. □

Corollary 3.3.1. The unique $L$ and $M$ are invariant under Ritt swaps within the chebyclump.

Proof. Without loss of generality, all of $L \circ f_j, f_{j-1}, \dots, f_{i+1}, f_i \circ M$ are actually Chebyshevs, in which case the assertion is verified immediately by writing out the Ritt swap. □

Lemma 3.3.2. If $f_b$ has odd degree, then $(f_c, \dots, f_b, \dots, f_a)$ is a chebyclump if and only if both $(f_c, \dots, f_b)$ and $(f_b, \dots, f_a)$ are chebyclumps.

Proof. The forward direction is obvious, so we assume that both $(f_c, \dots, f_b)$ and $(f_b, \dots, f_a)$ are chebyclumps and prove that $(f_c, \dots, f_b, \dots, f_a)$ is one.

Without loss of generality, we may assume that, for some Chebyshev polynomials $g_c$ and $g_b$ and linear $A$ and $B$, $f_c = A \circ g_c$, $f_b = g_b \circ B$, and that the $f_i$ are themselves Chebyshev for $c > i > b$.

On the other hand, $(f_b, \dots, f_a)$ is linearly equivalent to $(C \circ g_b, g_{b-1}, \dots, g_{a+1}, g_a \circ D)$ for linear $C$ and $D$ and Chebyshev $g_i$. The occurrence of $g_b$ here and in the previous paragraph is not an accident: it is indeed the same Chebyshev of the same degree as $f_b$.

Let $L_b, \dots, L_{a+1}$ witness this:

$f_b \circ L_b = C \circ g_b$, $L_b^{-1} \circ f_{b-1} \circ L_{b-1} = g_{b-1}$, ..., $L_{a+2}^{-1} \circ f_{a+1} \circ L_{a+1} = g_{a+1}$, $L_{a+1}^{-1} \circ f_a = g_a \circ D$.

Since $f_b = g_b \circ B$, the first of the above gives $g_b \circ B \circ L_b = f_b \circ L_b = C \circ g_b$. But $g_b$ is a Chebyshev polynomial of odd degree, only linearly related to itself by $(-1) * g_b = g_b$, so $B \circ L_b = C = (\cdot \pm 1)$.

Now the same linear factors $L_i$ inserted in the same places witness that $(f_c, \dots, f_b, \dots, f_a)$ is linearly equivalent to $(f_c, \dots, f_{b+1}, f_b \circ L_b, L_b^{-1} \circ f_{b-1} \circ L_{b-1}, \dots, L_{a+1}^{-1} \circ f_a) = (A \circ g_c, g_{c-1}, \dots, g_{b+1}, g_b \circ (\cdot \pm 1), \dots, g_{a+1}, g_a \circ D)$.

The $(\cdot \pm 1)$ linear factor can be pulled left out of the middle of a chebyclump: $C_p \circ (\cdot\,(-1)) = (\cdot\,(-1)) \circ C_p$ for odd $p$, and $C_2 \circ (\cdot\,(-1)) = C_2$, so this is indeed a chebyclump as wanted. □
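The sign manipulation at the end of this proof is easy to test numerically. The following sketch again assumes the normalisation $C_0 = 2$, $C_1 = x$, $C_{n+1} = x C_n - C_{n-1}$; the function name `cheb_eval` is hypothetical. It compares polynomials by their values at more integer points than their degree, which suffices to establish each identity.

```python
def cheb_eval(n, x):
    """Value of C_n at x, using the assumed recursion C_{n+1} = x*C_n - C_{n-1}."""
    prev, cur = 2, x
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, x * cur - prev
    return cur

# 41 integer sample points: more than the largest degree (15) compared below.
points = range(-20, 21)

# Odd Chebyshevs are odd functions: C_p o (-x) = (-x) o C_p.
odd_commutes = all(cheb_eval(p, -x) == -cheb_eval(p, x)
                   for p in (3, 5, 7) for x in points)

# C_2 is even, so it absorbs the sign: C_2 o (-x) = C_2.
c2_absorbs = all(cheb_eval(2, -x) == cheb_eval(2, x) for x in points)

# A sign in the middle of an odd clump can be pulled out to the left ...
pulled_left = all(cheb_eval(3, -cheb_eval(5, x)) == -cheb_eval(15, x)
                  for x in points)

# ... and disappears entirely once it meets a C_2 factor.
absorbed = all(cheb_eval(2, -cheb_eval(3, x)) == cheb_eval(6, x) for x in points)
```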

The purpose of all those technical bits is: 

Lemma 3.3.3. Suppose $(f_j, \dots, f_i)$ is a maximal chebyclump of the decomposition $\vec f$ of $f$. Let $\vec g$ be another decomposition of $f$ obtained from $\vec f$ by a single Ritt swap. Then $(g_r, \dots, g_s)$ is a maximal chebyclump of $\vec g$ for some $r \in \{j-1, j, j+1\}$ and some $s \in \{i-1, i, i+1\}$. Further, the odd parts of the degrees of the two chebyclumps are the same.

Proof. First note that Ritt swaps within the chebyclump, and Ritt swaps among factors neither in nor adjacent to the chebyclump, have no effect. Thus we have nothing to prove except for the Ritt swaps at $(j+1)$, $j$, $(i-1)$ and $(i-2)$. Note that in any case $(g_{j-1}, \dots, g_{i+1})$ is still a chebyclump. Thus a maximal chebyclump cannot lose more than one factor through a single Ritt swap; since Ritt swaps are invertible, this means that a maximal chebyclump also cannot gain more than one factor. It remains to show that the factor gained (or, symmetrically, lost) cannot be odd. Suppose towards contradiction that a Ritt swap at $(j+1)$ adds an odd factor to the chebyclump; that is, $t_{j+1} * (b, a, f_j, \dots, f_i) = (d, c, f_j, \dots, f_i)$, where $c$ is odd type C and $(c, f_j, \dots, f_i)$ is a chebyclump. Since $(d, c)$ is swappable, by observation 1 above, $(d, c)$ is a chebyclump. Now by the last lemma the whole $(d, c, f_j, \dots, f_i)$ is a chebyclump, giving the desired contradiction. The proof for the Ritt swap at $(i-2)$ is identical. □

The next (hard) technical lemma describes the very rare situations when two quadratic factors can get all the way across a decomposition.

Lemma 3.3.4. Let $\vec f$ be a decomposition of length $k$ of $f$ with $\deg(f_2) = \deg(f_1) = 2$; let $v_1 := t_{k-1} t_{k-2} \dots t_2$ and $v_2 := t_{k-2} t_{k-3} \dots t_1$ and $v = v_2 v_1$; suppose that $v * \vec f$ is defined. Then either $\vec f$ is a chebyclump, or $\vec f$ is linearly equivalent to $(A_k \circ \gamma_k, \gamma_{k-1}, \dots, \gamma_3, Q, Q \circ L)$ for some linear $L$ and $A_k$, where each $\gamma_i$ is a monomial or a ritty polynomial whose out-degree is a multiple of 4.

Proof. Up to linear equivalence, we may take $\vec f := (f_k, \dots, f_3, Q, M \circ Q \circ L)$ for some linear $L$ and translation $M$.

Since $t_2 * \vec f$ is defined, there is a linear $A_3$, an integer $r_3$, and a monic polynomial $u_3$ such that either $A_3^{-1} \circ f_3 = x^{r_3} \cdot u_3(x)^2$ or $A_3^{-1} \circ f_3$ is a monomial. Let $\alpha_3 := x^{r_3} \cdot u_3(x^2)$ in the first case, and the same monomial in the second case. Then $t_2 * \vec f = (f_k, \dots, f_4, A_3 \circ Q, \alpha_3, M \circ Q \circ L)$.
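The swap just described rests on the identity $(x^r u(x)^2) \circ Q = Q \circ (x^r u(x^2))$ with $Q = x^2$. A minimal numerical check, with a hypothetical choice of $u$ and $r$ (neither is taken from the text), might look as follows; agreement at more points than the degree of both sides establishes the polynomial identity for this instance.

```python
R = 3  # hypothetical exponent r

def u(x):
    """A hypothetical monic polynomial u; any polynomial would do."""
    return x**3 + 2 * x + 5

def Q(x):
    return x * x

def normalised_factor(x):
    """x^r * u(x)^2, the normalised form of the factor sitting to the left of Q."""
    return x**R * u(x) ** 2

def alpha(x):
    """x^r * u(x^2), the factor that ends up to the right of Q after the swap."""
    return x**R * u(x * x)

# Both sides have degree 18, so agreement at 61 integer points proves equality.
swap_ok = all(normalised_factor(Q(x)) == Q(alpha(x)) for x in range(-30, 31))
```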

Since $t_3 t_2 * \vec f$ is defined, there is a linear $A_4$, an integer $r_4$, and a monic polynomial $u_4$ such that either $A_4^{-1} \circ f_4 \circ A_3 = x^{r_4} \cdot u_4(x)^2$ or it is a monomial. Let $\alpha_4 := x^{r_4} \cdot u_4(x^2)$ in the first case, and the same monomial in the second case. Then $t_3 t_2 * \vec f = (f_k, \dots, f_5, A_4 \circ Q, \alpha_4, \alpha_3, M \circ Q \circ L)$.

Inducting, since $t_i \dots t_3 t_2 * \vec f$ is defined, there is a linear $A_{i+1}$, an integer $r_{i+1}$, and a monic polynomial $u_{i+1}$ such that either $A_{i+1}^{-1} \circ f_{i+1} \circ A_i = x^{r_{i+1}} \cdot u_{i+1}(x)^2$ or it is a monomial. Let $\alpha_{i+1} := x^{r_{i+1}} \cdot u_{i+1}(x^2)$ in the first case, and the same monomial in the second case. Then $t_i \dots t_3 t_2 * \vec f = (f_k, \dots, f_{i+2}, A_{i+1} \circ Q, \alpha_{i+1}, \alpha_i, \dots, \alpha_3, M \circ Q \circ L)$.

And at the end we get

$v_1 * \vec f = (A_k \circ Q, \alpha_k, \dots, \alpha_3, M \circ Q \circ L)$.

In the rest of the proof we will name many linear factors witnessing Ritt swaps between ritty indecomposables. In each of these Ritt swaps one of the indecomposables is $Q$, so none of these Ritt swaps will involve commuting odd Chebyshevs, so we may and will always choose these witnesses to be translations (see Remark EXT]).

Now since $t_1 v_1 * \vec f$ is defined, $(\alpha_3, M \circ Q \circ L)$ is swappable. Thus, there must be some translation $N$, integer $s$, and polynomial $v$ such that $N \circ \alpha_3 \circ M = x^s \cdot v(x)^2$ or $N \circ \alpha_3 \circ M$ is a monomial. There are two distinct possibilities: either $M = N = \mathrm{id}$, or $\alpha_3$ and $N \circ \alpha_3 \circ M$ are two non-trivially translation-related ritty polynomials. The rest of the proof goes differently in the two cases, yielding the two different conclusions of the lemma.

Case 1: Suppose that $M = N = \mathrm{id}$. Then either $\alpha_3$ is a monomial, or $\alpha_3 = x^{s_3} \cdot v_3(x)^2$; let $\beta_3 := \alpha_3$ if it is a monomial, and $\beta_3 := x^{s_3} \cdot v_3(x^2)$ otherwise. Then $t_1 v_1 * \vec f$ is linearly equivalent to $(A_k \circ Q, \alpha_k, \dots, \alpha_4, Q, \beta_3, L)$.

Now since $t_2 t_1 v_1 * \vec f$ is defined, $(\alpha_4, Q)$ is swappable. Since there are no solutions to translation $\circ$ ritty $=$ ritty, this means that either $\alpha_4$ is a monomial, or $\alpha_4 = x^{s_4} \cdot v_4(x)^2$; let $\beta_4 := \alpha_4$ if it is a monomial, and $\beta_4 := x^{s_4} \cdot v_4(x^2)$ otherwise. Then $t_2 t_1 v_1 * \vec f$ is linearly equivalent to $(A_k \circ Q, \alpha_k, \dots, \alpha_5, Q, \beta_4, \beta_3, L)$.

Inducting, for each $4 < i \le k$, since $t_{i-2} \dots t_2 t_1 v_1 * \vec f$ is defined, $(\alpha_i, Q)$ is swappable. Since there are no solutions to translation $\circ$ ritty $=$ ritty, this means that either $\alpha_i$ is a monomial, or $\alpha_i = x^{s_i} \cdot v_i(x)^2$; let $\beta_i := \alpha_i$ if it is a monomial, and $\beta_i := x^{s_i} \cdot v_i(x^2)$ otherwise. Then for $i < k$ we get that $t_{i-2} \dots t_2 t_1 v_1 * \vec f$ is linearly equivalent to $(A_k \circ Q, \alpha_k, \dots, \alpha_{i+1}, Q, \beta_i, \dots, \beta_3, L)$, and finally

$v_2 v_1 * \vec f = (A_k \circ Q, Q, \beta_k, \dots, \beta_3, L)$.

Note that in this case, for each $i$ we have that either $\beta_i = \alpha_i$ is a monomial, or $\alpha_i = x^{s_i} \cdot v_i(x)^2$ and $\beta_i = x^{s_i} \cdot v_i(x^2)$. Remember from the first half of the proof that $\alpha_i = A_i^{-1} \circ f_i \circ A_{i-1}$ if it is a monomial, and that otherwise $\alpha_i = x^{r_i} \cdot u_i(x^2)$ and $A_i^{-1} \circ f_i \circ A_{i-1} = x^{r_i} \cdot u_i(x)^2$. So either $\beta_i = \alpha_i = A_i^{-1} \circ f_i \circ A_{i-1}$ is a monomial; or $\alpha_i = x^{s_i} \cdot v_i(x)^2 = x^{r_i} \cdot u_i(x^2)$, i.e. $r_i = s_i$ and there is some polynomial $w_i$ such that $\alpha_i = x^{r_i} \cdot w_i(x^2)^2$ and $A_i^{-1} \circ f_i \circ A_{i-1} = x^{r_i} \cdot w_i(x)^4$. Let $\gamma_i := \alpha_i$ if it is a monomial, and otherwise let $\gamma_i := x^{r_i} \cdot w_i(x)^4 = A_i^{-1} \circ f_i \circ A_{i-1}$. Now $A_{k-1}$ through $A_3$ witness the conclusion of the lemma.
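To see why the out-degree becomes a multiple of 4 in Case 1, it may help to follow one factor as both quadratics pass it. The sketch below uses a hypothetical $w$ and $r$ (not from the text) and checks, by evaluation at more points than the degrees involved, that $\gamma = x^r w(x)^4$ swaps past $Q$ twice, first becoming $\alpha = x^r w(x^2)^2$ and then $\beta = x^r w(x^4)$.

```python
R = 1  # hypothetical exponent r

def w(x):
    """A hypothetical polynomial w."""
    return x + 1

def Q(x):
    return x * x

def gamma(x):
    """x^r * w(x)^4: the factor before either quadratic has passed it."""
    return x**R * w(x) ** 4

def alpha(x):
    """x^r * w(x^2)^2: the factor after the first quadratic has passed."""
    return x**R * w(x * x) ** 2

def beta(x):
    """x^r * w(x^4): the factor after the second quadratic has passed."""
    return x**R * w(x**4)

points = range(-15, 16)  # 31 points; the largest degree compared is 20

first_pass = all(gamma(Q(x)) == Q(alpha(x)) for x in points)   # gamma o Q = Q o alpha
second_pass = all(alpha(Q(x)) == Q(beta(x)) for x in points)   # alpha o Q = Q o beta
both = all(gamma(x**4) == beta(x) ** 4 for x in points)        # gamma o x^4 = x^4 o beta
```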

Case 2: The other possibility is that $N \circ \alpha_3 \circ M = x^s \cdot v(x)^2$ with at least one of $N$ and $M$ not the identity. Then $\alpha_3$ is not a monomial, so we see from the first part of the proof that its in-degree is at least 2. Thus, $\alpha_3$ is not type J, so it must be type C, and in fact of the form $\lambda_3 * C_{p_3}$ for some $\lambda_3$. We now wish to insert scalings into the decomposition so as to maintain the monicity of all $\alpha_i$ and turn $\alpha_3$ into $C_{p_3}$.

For i > 3, let Aj+i = A^ 08 *-" 1 ' so that ('j"^") a i i'^-i) = * a i is monic. Let 
Ui := (A 2 ) * Ui so that Ai * at — x Ti ■ Ui(x 2 ). Putting it all together, let 

g := (A k o (-A 2 . +1 ) o Q, X k * a fe , . . . , A 4 * a 4 , C P3 ,M' 0Q0L') 

where M' is a translation such that (-j^) M = M' o (-j^) and L' := ( - -^=) L. 

Since $\vec g$ is linearly equivalent to our previous decomposition of $v_1 * \vec f$, we may say that

$v_1 * \vec f = \vec g$.

And now we do the whole thing all over again, applying $v_2$ to $\vec g$ one swap at a time. But now we are applying these swaps to the $g_i$, about which we know a whole lot, rather than to the $f_i$, about which we knew nothing. This is why it takes two quadratic factors to get the conclusion of the lemma.

So, back to $t_1 * \vec g$ being defined: for some $N$, $s$, $v$, $N^{-1} \circ C_{p_3} \circ M' = x^s \cdot v(x)^2$. But then we must have $N = M' = (-2)$ or $N = M' = (+2)$. In the first case, $t_1 * \vec g = (\dots, \lambda_4 * \alpha_4, (-2) \circ Q, C_{p_3} \circ L')$. In the second case, $t_1 * \vec g = (\dots, \lambda_4 * \alpha_4, (+2) \circ Q, i * C_{p_3} \circ L')$.

Now $t_2 t_1 * \vec g$ is defined, so for some $N$, $s$, $v$, $N^{-1} \circ \lambda_4 * \alpha_4 \circ M' = x^s \cdot v(x)^2$. Whether $M' = (+2)$ or $M' = (-2)$, this still forces $\lambda_4 * \alpha_4 = C_{p_4}$ and $N = M'$. So the two options for $t_2 t_1 * \vec g$ are $(\dots, \lambda_5 * \alpha_5, (+2) \circ Q, i * C_{p_4}, i * C_{p_3} \circ L')$ and $(\dots, \lambda_5 * \alpha_5, (-2) \circ Q, C_{p_4}, C_{p_3} \circ L')$.

Induct as before to obtain $v_2 * \vec g =$

$(A_k \circ (\cdot\,\lambda_{k+1}^2) \circ Q, (-2) \circ Q, C_{p_k}, \dots, C_{p_4}, C_{p_3} \circ L')$

or

$(A_k \circ (\cdot\,\lambda_{k+1}^2) \circ Q, (+2) \circ Q, i * C_{p_k}, \dots, i * C_{p_4}, i * C_{p_3} \circ L')$.

In either case the whole decomposition is a chebyclump. □

Let us now return to the question at hand, namely to decompositions $\vec g$, $\vec \pi$ at least one of which has at least one type C factor, and to the word $v := (t_1 \dots t_l) \dots (t_{l+m-2} \dots t_{m-1})(t_{l+m-1} \dots t_m)$ such that $v * \vec g \vec \pi$ is defined.

Proposition 3.3.3. (type C lemma) Theorem 3.3 holds when at least one of $g$ and $\pi$ has a type C factor.

Proof. Note that if one of $g$ and $\pi$ has a type C factor, all factors of the other one must be either type C or degree 2. Since each factor of $g$ must at some point swap with each factor of $\pi$, they cannot both have quadratic factors. This leaves the following options:

(1) Both $g$ and $\pi$ have a type C factor, and one of $g$ and $\pi$ is linearly related to an odd-degree (decomposable) Chebyshev polynomial.

(2) $g$ has a type C factor and no quadratic factors, while all factors of $\pi$ are quadratic.

(3) $\pi$ has a type C factor and no quadratic factors, while all factors of $g$ are quadratic.

In the first case, the whole $\vec g \vec \pi$ must be one giant chebyclump, which forces the unique linear $L$ and $M$ such that $M \circ \pi \circ L = C_n$ for $n = \deg(\pi)$ to also make $L^\sigma \circ f \circ L^{-1}$ a Chebyshev polynomial, contradicting its triviality.

In the second case, if $\pi$ is itself quadratic, then $M \circ \pi \circ L = Q$ for some linear $M$ and $L$ and the theorem is clearly true. Otherwise, the previous lemma applies to $\vec g \vec \pi$. If $\pi$ is linearly related to $x^4$, the conclusion of the theorem follows. If $\pi$ is not linearly related to $P_4$, then chasing the linear factors in the lemma above, one sees that $g$ must be skew-conjugate to a Chebyshev polynomial.

For the third case, we only sketch the proof. First, we break $\pi$ into maximal chebyclumps: $\pi := M^{-1} \circ C_{n_a} \circ T_a \circ C_{n_{a-1}} \circ \dots \circ C_{n_2} \circ T_2 \circ C_{n_1} \circ L^{-1}$ for linear $M$, $L$, and non-trivial linear $T_i$. If $\deg g > 2$, the lemma above applied to $\vec \pi^\sigma \vec f$ implies that $\pi$ must be a single chebyclump, and chasing the translations in the proof of the lemma shows that $g$ is skew-conjugate to $C_{\deg(g)}$. If $g$ is quadratic, and $m$ is the number of indecomposable factors of $\pi$, then $t_1 t_2 \dots t_m * \vec g \vec \pi = \vec \rho \vec f$ and $\vec \rho$ will break into maximal chebyclumps the same way that $\vec \pi$ did, but where the $T_a$ had to be scalings in order for $t_1 t_2 \dots t_m * \vec g \vec \pi$ to be defined, the corresponding linear factors in $\vec \rho$ will not be scalings, contradicting the fact that $\vec \rho$ is a decomposition of $\pi^\sigma$. □

3.7. skew-twist monoid. In trying to make this section more readable, we have tried to motivate the rather technical results as we go along. Unfortunately, the motivation often involves concepts not defined until Section 2. The worst offenders are trivial polynomials, which are simply polynomials that are not skew-conjugate to any monomial or Chebyshev polynomial; and correspondences between $\sigma$-varieties $(\mathbb{A}^1, f)$ and $(\mathbb{A}^1, g)$, which are simply $(f, g)$-skew-invariant curves.

3.7.1. skew-linear equivalence. 

Definition 3.3.2. Remember, for linear $L$, two polynomials $f$ and $g := L^\sigma \circ f \circ L^{-1}$ are called skew-conjugate.

In this case, $L : (\mathbb{A}^1, f) \to (\mathbb{A}^1, g)$ is an isomorphism of $\sigma$-varieties and, when $f$, $g$, and $L$ are defined over the fixed field of $\sigma$, of the corresponding dynamical systems. For fixed $f$ and $g$, $L$ also gives rise to a bijection between the set of decompositions of $f$ and the set of decompositions of $g$: if $(f_k, f_{k-1}, \dots, f_2, f_1)$ is a decomposition of $f$, then $(L^\sigma \circ f_k, f_{k-1}, \dots, f_2, f_1 \circ L^{-1})$ is a decomposition of $g$.

Lemma 3.3.5. Given a linear polynomial $L$ and a decomposition $\vec f$ of a polynomial $f$, let $\vec g := (L^\sigma \circ f_k, f_{k-1}, \dots, f_2, f_1 \circ L^{-1})$, a decomposition of $g := L^\sigma \circ f \circ L^{-1}$. Thus for fixed $f$ and $g$, $L$ gives rise to a bijection between decompositions of $f$ and decompositions of $g$, which respects linear equivalence.

Definition 3.3.3. Two decompositions $\vec f$ and $\vec h$ are skew-linearly-equivalent if there is a linear $L$ such that $\vec h$ is linearly equivalent to $(L^\sigma \circ f_k, f_{k-1}, \dots, f_2, f_1 \circ L^{-1})$.

Skew-linearly-equivalent decompositions may be decompositions of different, but 
always skew-conjugate, polynomials. It is immediate that 

Lemma 3.3.6. Skew-linear-equivalence is an equivalence relation.

3.7.2. skew-twists. 

Definition 3.3.4. The decomposition $(f_1^\sigma, f_k, \dots, f_2)$ is called the single-skew-twist of the decomposition $\vec f := (f_k, \dots, f_2, f_1)$ and denoted $\phi * \vec f$. ($\phi$ stands for "forward".)

If $\vec f$ is a decomposition of a polynomial $f$, then $\phi * \vec f$ is a decomposition of a (probably different) polynomial $h$; we call $h$ a single-skew-twist of $f$.

To undo what $\phi$ does, we define $\beta * \vec f := (f_{k-1}, \dots, f_1, f_k^{\sigma^{-1}})$. ($\beta$ for "back".)

Note that while a decomposition has a unique single-skew- twist, a polynomial 
may have several single-skew-twists, coming from different decompositions. In par- 
ticular, single-skew-twists of linearly-equivalent decompositions will be discussed 
shortly. 

If $h$ is a single skew-twist of $f$, then $h \circ f_1 = f_1^\sigma \circ f$, so the graph of $f_1$ is an $(f, h)$-skew-invariant subvariety of $\mathbb{A}^2$; and $f_1$ is a morphism of $\sigma$-varieties from the one defined by $f$ to the one defined by $h$. We have shown in Theorem 3.3 that under most circumstances, all skew-invariant curves come from composing many such morphisms, possibly in different directions. This suggests the following definition:

Definition 3.3.5. For polynomials $f$ and $g$, the relation "$f$ is a skew-twist of $g$" is the symmetric-transitive closure of the relation "$f$ is a single-skew-twist of $g$".
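The identity $h \circ f_1 = f_1^\sigma \circ f$ behind single skew-twists can be illustrated concretely. In the sketch below, $\sigma$ is taken to be complex conjugation (an assumption made only for illustration; the text allows any field automorphism), polynomials are coefficient lists with lowest degree first, and the factors `f3`, `f2`, `f1` as well as all function names are hypothetical.

```python
def ev(p, x):
    """Evaluate the coefficient list p (lowest degree first) at x by Horner's scheme."""
    acc = 0
    for c in reversed(p):
        acc = acc * x + c
    return acc

def sigma(p):
    """Apply the automorphism (here: complex conjugation) to the coefficients of p."""
    return [complex(c).conjugate() for c in p]

def ev_decomposition(factors, x):
    """Evaluate f_k o ... o f_1 at x, for factors listed as [f_k, ..., f_1]."""
    for p in reversed(factors):
        x = ev(p, x)
    return x

def single_skew_twist(factors):
    """phi * (f_k, ..., f_2, f_1) = (f_1^sigma, f_k, ..., f_2)."""
    return [sigma(factors[-1])] + factors[:-1]

# Hypothetical factors with Gaussian-integer coefficients.
f3 = [1, 0, 1j]            # i*x^2 + 1
f2 = [0, 1 + 1j, 0, 1]     # x^3 + (1+i)*x
f1 = [1j, 0, 1]            # x^2 + i
f_dec = [f3, f2, f1]
h_dec = single_skew_twist(f_dec)

# Check h o f_1 = f_1^sigma o f on a grid of Gaussian integers.
points = [complex(a, b) for a in range(-2, 3) for b in range(-2, 3)]
twist_ok = all(
    ev_decomposition(h_dec, ev(f1, x)) == ev(sigma(f1), ev_decomposition(f_dec, x))
    for x in points
)
```

In other words, a point $(x, f_1(x))$ on the graph of $f_1$ is sent by $(f, h)$ to $(f(x), h(f_1(x)))$, which lies on the graph of $f_1^\sigma$.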

3.7.3. monoid. Similar to the monoid of Ritt swaps acting on linear-equivalence classes of decompositions, we now define a monoid of Ritt swaps and single skew-twists, acting on skew-linear-equivalence classes of decompositions. While the action of $M_k$ always produced a new decomposition of the same polynomial, the action of $B_k$ will produce decompositions of skew-twists of the original polynomial.

We start with an analog of the first crucial Lemma 3.1.3.

Lemma 3.3.7. Ritt swaps are well-defined up to skew-linear-equivalence. Single skew-twists are well-defined up to skew-linear-equivalence.

Proof. In Lemma 3.1.3 we proved that Ritt swaps are well-defined up to linear equivalence; so for the first statement, we only need to prove that if a decomposition $(h_k, h_{k-1}, \dots, h_2, h_1)$ of $f$ is obtained from $\vec f$ by a Ritt swap at $i$, then the decomposition $(L^\sigma \circ h_k, h_{k-1}, \dots, h_2, h_1 \circ L^{-1})$ of $g := L^\sigma \circ f \circ L^{-1}$ is obtained from $\vec g := (L^\sigma \circ f_k, f_{k-1}, \dots, f_2, f_1 \circ L^{-1})$ by a Ritt swap at $i$. This is immediate from the definition of Ritt swap, with the same linear factor witnesses.

For the second part of the Lemma, we need two things ($g$ and $\vec g$ stay the same):

• The decomposition $(f_1^\sigma, f_k, \dots, f_2)$ obtained from $\vec f$ by a plain skew-twist is linearly equivalent to the decomposition $((f_1 \circ L^{-1})^\sigma, L^\sigma \circ f_k, f_{k-1}, \dots, f_2)$ obtained from $\vec g$ by a plain skew-twist.

• If $\vec a$ is obtained from $\vec c$ by a plain skew-twist, and $\vec b$ is obtained from $\vec d$ by a plain skew-twist, and $\vec d$ is linearly equivalent to $\vec c$, then $\vec a$ is skew-linearly equivalent to $\vec b$.

The first is immediate, and so is the second once the assumptions are written out:

$\vec a = (c_1^\sigma, c_k, \dots, c_2)$

$\vec d = (c_k L_k, L_k^{-1} c_{k-1} L_{k-1}, \dots, L_3^{-1} c_2 L_2, L_2^{-1} c_1)$

$\vec b = ((L_2^{-1} c_1)^\sigma, c_k L_k, L_k^{-1} c_{k-1} L_{k-1}, \dots, L_3^{-1} c_2 L_2)$

Now let $L := L_2^{-1}$ and note that $\vec b$ is linearly equivalent to $(L^\sigma \circ a_k, a_{k-1}, \dots, a_2, a_1 \circ L^{-1})$. □

Definition 3.3.6. Let $B_k$ be the free monoid generated by $\{t_i : 1 \le i \le k-1\}$ and $\phi$ and $\beta$.

Let $S_f$ be the set of skew-linear-equivalence classes of decompositions of skew-twists of $f$.

Lemma 3.3.8. If $\vec f$ is a decomposition of $f$, then any decomposition of any skew-twist of $f$ may be obtained by a finite sequence of the following operations:

• $t_i$: Ritt swap at $i$, defined long ago;

• $\phi$, defined by $\phi * (f_k, \dots, f_1) = (f_1^\sigma, f_k, \dots, f_2)$;

• $\beta$, defined by $\beta * \vec f := (f_{k-1}, \dots, f_1, f_k^{\sigma^{-1}})$.

This defines an action of $B_k$ on $S_f \cup \{\infty\}$, transitive on $S_f$.
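The two generators $\phi$ and $\beta$ are simple cyclic moves on the list of factors, and some of their basic relations can be checked directly on a small example. This is only a sketch under stated assumptions: $\sigma$ is taken to be complex conjugation (so $\sigma^{-1} = \sigma$), decompositions are tuples $(f_k, \dots, f_1)$ of coefficient tuples, and the names `phi`, `beta`, `power` and the sample decomposition are hypothetical.

```python
def sigma(p):
    """Apply the automorphism (here: complex conjugation) to a coefficient tuple."""
    return tuple(complex(c).conjugate() for c in p)

sigma_inv = sigma  # conjugation is an involution, so it is its own inverse

def phi(dec):
    """phi * (f_k, ..., f_2, f_1) = (f_1^sigma, f_k, ..., f_2)."""
    return (sigma(dec[-1]),) + dec[:-1]

def beta(dec):
    """beta * (f_k, ..., f_1) = (f_{k-1}, ..., f_1, f_k^(sigma^-1))."""
    return dec[1:] + (sigma_inv(dec[0]),)

def power(op, n, dec):
    """Apply the operation op to dec n times."""
    for _ in range(n):
        dec = op(dec)
    return dec

# Hypothetical decomposition (f_3, f_2, f_1); coefficients listed lowest degree first.
dec = ((1, 0, 1j), (0, 1 + 1j, 0, 1), (1j, 0, 1))
k = len(dec)

# On decompositions, beta undoes phi and vice versa.
beta_undoes_phi = beta(phi(dec)) == dec
phi_undoes_beta = phi(beta(dec)) == dec

# Applying phi k times twists every factor by sigma; beta^k twists by sigma^-1.
phi_k_is_sigma = power(phi, k, dec) == tuple(sigma(p) for p in dec)
beta_k_is_sigma_inv = power(beta, k, dec) == tuple(sigma_inv(p) for p in dec)
```

Note that these equalities concern decompositions only; the correspondences encoded by such words are a finer matter.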

3.7.4. correspondences encoded. In the case of the monoid of Ritt swaps acting on the decompositions of a single polynomial, the only question was which decompositions could be obtained; how they could be obtained only mattered as a tool. Now things are different: a word $w$ in $B_k$ applied to a decomposition $\vec f$ not only gives another decomposition $\vec g := w * \vec f$, but also encodes a correspondence between $\sigma$-varieties $(\mathbb{A}^1, f)$ and $(\mathbb{A}^1, g)$. We care which correspondences. Intuitively, this correspondence comes from a long commutative diagram (here, superscripts are merely names and not any kind of power)

[Commutative diagram (15): a column of horizontal arrows labelled $f = f^0$, $f^1$, ..., $f^n = g$, with consecutive rows joined by vertical arrows.]

where the horizontal arrows are the polynomials whose decompositions are obtained from $\vec f$ by longer and longer subwords of $w$, and the vertical arrows are either the identity, if the corresponding symbol in $w$ is one of the $t_i$, or a morphism down for $\phi$ and up for $\beta$. More precisely,

Definition 3.3.7. Given a word $w := w_n \dots w_2 w_1 \in B_k$ and a decomposition $\vec f$ of a polynomial $f$, let $\vec g := w * \vec f$ be a decomposition of a polynomial $g$, and for each $j \le n$ let $\vec f^j := (w_j \dots w_1) * \vec f$ be a decomposition of a polynomial $f^j$; so $f^0 = f$ and $f^n = g$.

Then the correspondence encoded by $w$ and $\vec f$ is a correspondence $A_w$ between $\sigma$-varieties $(\mathbb{A}^1, f)$ and $(\mathbb{A}^1, g)$ defined by

$(a, b) \in A_w$ if and only if there are $a_0 = a, a_1, a_2, \dots, a_n = b$ such that for each $j$

• if $w_j = t_i$ for some $i$, then $a_{j+1} = a_j$;

• if $w_j = \phi$, then $a_{j+1} = f^j_1(a_j)$;

• if $w_j = \beta$, then $a_j = f^{j+1}_1(a_{j+1})$.

Remark 3.3.1. A careful reader will not permit us to speak of the $(f, g)$-invariant correspondence encoded by $w * \vec f$, when even $\vec g := w * \vec f$ is only defined up to skew-linear equivalence, and so $g$ is only defined up to skew-conjugacy! She would worry that these linear factors entering at every step of the inductive definition might make an intractable mess. These concerns are addressed by Lemma 3.3.7, which shows that $g$ and $C_w$ are well-defined up to a single linear factor.

It is clear that $A_w$ is an $(f, g)$-skew-invariant set: one simply pushes the witnesses forward by $a_j \mapsto f^j(a_j)$ and notes that every box in the diagram above commutes. On the other hand, $A_w$ will usually be reducible, and its irreducible components may be skew-invariant, skew-periodic, or strictly skew-pre-periodic. We only care about the skew-invariant components, which suggests the following definition.

Definition 3.3.8. Two words $w$ and $w'$ in $B_k$ are equivalent with respect to $\vec f$, written $w \approx_f w'$, if $w * \vec f = w' * \vec f =: \vec g$ (in particular, defined), and the two correspondences $A_w$ and $A_{w'}$ have the same skew-invariant irreducible components (i.e. for any invariant irreducible $\mathcal{D}$, we have $\mathcal{D} \subset A_w$ if and only if $\mathcal{D} \subset A_{w'}$), and for any irreducible component $\mathcal{E}$ of one but not the other, $(f \times g)(\mathcal{E})$ is skew-invariant.

The last bit of the definition is needed for $\approx$ to be preserved under concatenation. It is harmless, as the only source of equivalent words whose correspondences are not identical is the following remark.

Remark 3.3.2. In particular, the $(f, f)$-skew-invariant correspondence $C_{\beta\phi}$ encoded by $\beta\phi$ is defined by $f_1(a) = f_1(b)$. One of its irreducible components, the diagonal $a = b$, is clearly $(f, f)$-(skew)-periodic. However, the image of the whole $C_{\beta\phi}$ under $(f \times f)$ is just the diagonal, so all other irreducible components are strictly pre-skew-periodic.
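A toy instance makes the remark concrete. The sketch below assumes $\sigma$ is the identity (so skew-invariance is ordinary invariance) and uses hypothetical factors $f_1 = x^2$ and $f_2 = x^3 + x$; it lists the integer points of the correspondence $f_1(a) = f_1(b)$ in a small box and checks that $(f \times f)$ sends all of them to the diagonal.

```python
def f1(x):
    """Hypothetical first factor."""
    return x * x

def f2(x):
    """Hypothetical second factor."""
    return x**3 + x

def f(x):
    """f = f_2 o f_1."""
    return f2(f1(x))

box = range(-5, 6)

# Integer points of the correspondence f_1(a) = f_1(b) inside the box.
corr = [(a, b) for a in box for b in box if f1(a) == f1(b)]

# The diagonal is one component; here the other component is the line b = -a.
off_diagonal = [(a, b) for (a, b) in corr if a != b]

# (f x f) maps every point of the correspondence onto the diagonal.
image_on_diagonal = all(f(a) == f(b) for (a, b) in corr)
```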

We now show that $\phi$ and $\beta$ commute up to $\approx$; that $\phi^k$ and $\beta^k$ commute with everything up to $\approx$; and that $\approx$ is preserved under concatenation.

Lemma 3.3.9. (1) $\phi\beta \approx \mathrm{id} \approx \beta\phi$.

(2) Suppose $u_1 \approx_f v_1$, and so let $\vec g := u_1 * \vec f = v_1 * \vec f$, and suppose $u_2 \approx_g v_2$; then $u_2 u_1 \approx_f v_2 v_1$.

(3) For any word $w$ in $M_k$, $w\phi^k \approx \phi^k w$ and $w\beta^k \approx \beta^k w$.

Later, we will also want:

• For $i < k - 1$, $t_i \phi \approx \phi\, t_{i+1}$.

• For $i > 1$, $t_i \beta \approx \beta\, t_{i-1}$.

Proof. We indicate the ideas of the proof:

(1) One part is explained in the remark above, the other comes from the fact that "$\exists x\ f(x) = a$ and $f(x) = b$" is equivalent to $a = b$.

(2) Push the witnesses in the definition of $A_w$ forward.

(3) With part (1), we only need to show that $\phi^k$ and $\beta^k$ commute with Ritt swaps. This is so because $\phi^k * \vec f = \vec f^\sigma$ and $\beta^k * \vec f = \vec f^{\sigma^{-1}}$.

For the last two parts, note that the same two factors Ritt-swap on the two sides of each equation. □

Lemma 3.3.10. For all $w \in B_k$, there is some $u \in B_k$ that does not contain $\beta$ nor $\phi^k$ as a substring, and such that $w \approx \phi^{mk} u$ or $w \approx \beta^{nk} u$.

Proof. We may introduce extra $\beta\phi$ pairs into the word $w$. We introduce enough of them to obtain $w' \approx w$ so that $\beta$ only occurs in multiples of $k$ in $w'$. Then we pull all $\beta^k$ to the left, and obtain $\beta^{Nk} w'' \approx w'$ where $w''$ contains no instances of $\beta$. Then we can also pull all $\phi^k$ to the left and obtain $\beta^{Nk} \phi^{Mk} u \approx \beta^{Nk} w''$ where $u$ contains no instances of $\beta$, and no instances of $\phi^r$ for $r \ge k$. Then we cancel $\beta\phi$ pairs in the beginning. □

What geometry is hiding behind this bit of combinatorics? Naturally, $A_w$ comes with a diagram

$(\mathbb{A}^1, f) \leftrightarrow (\mathbb{A}^1, ?) \leftrightarrow \dots \leftrightarrow (\mathbb{A}^1, g)$

each arrow (pointing in one of the two directions) corresponding to an occurrence of $\phi$ or $\beta$ in $w$. What we just proved is that, for correspondences coming from skew-twists, we can instead look at irreducible components of the fiber product of the diagram

$(\mathbb{A}^1, f) \to (\mathbb{A}^1, g^{\sigma^N}) \leftarrow (\mathbb{A}^1, g)$

where we know exactly what one of the morphisms is. In the next two sections we describe the usual situation where $u$ in Lemma 3.3.10 can be taken to be of the form $\phi^n v$ where $v$ only contains Ritt swaps, so once appropriate decompositions are chosen, we know both arrows in the above diagram. Afterwards, we will need more combinatorics to mop up vicious special cases.

3.8. cracked. 

3.8.1. definitions. 

Definition 3.3.9. Suppose that $h = a \circ b$ for non-linear $a$ and $b$; let $g := b^\sigma \circ a$; we then call the triple $(h, g, b)$ a plain-skew-twist.

Translating the definition of plain-skew-twists into the language of our monoid action, we have the following lemma.

Lemma 3.3.11. The triple $(h, g, b)$ is a plain-skew-twist if and only if there are decompositions $\vec h$ of $h$ and $\vec g$ of $g$ and $i < k$ such that $\vec g = \phi^i * \vec h$ and the graph of $b = h_i \circ \dots \circ h_1$ is the correspondence from $h$ to $g$ encoded by $\phi^i$.

Definition 3.3.10. A pair $(b, a)$ of non-linear polynomials is called a crack of $f$ if $f = b \circ a$ and for any decomposition $\vec f$ of $f$, there is an integer $m$ and a linear polynomial $L$ such that

$b = f_k \circ \dots \circ f_{m+1} \circ L^{-1}$ and $a = L \circ f_m \circ \dots \circ f_1$.

We say that a polynomial $h$ is cracked at the edge if for any decomposition $\vec h$ of $h$ and for any $i$

$((h_i^\sigma \circ \dots \circ h_1^\sigma), (h_k \circ \dots \circ h_{i+1}))$

is a crack. Beware the universal quantifier on decompositions!

If $h$ is cracked at the edge and $g$ is a plain skew-twist of $h$, we say that $g$ is cracked.

The purpose of this notion is a transitivity for plain skew-twists. 

Proposition 3.3.4. If $h$ is cracked at the edge, and both $(h, g, b)$ and $(g, f, d)$ are plain skew-twists, then either $(h, f, d \circ b)$ is a plain skew-twist, or there is some $\pi$ for which $(h^\sigma, f, \pi)$ is a plain skew-twist, and $\pi \circ h = d \circ b$.

Proof. Here's the diagram witnessing the assumptions of the proposition:

[Commutative diagram (16): three horizontal arrows labelled $h = a \circ b$, $g = b^\sigma \circ a = c \circ d$ and $f = d^\sigma \circ c$, with vertical arrows $b$, $b^\sigma$ from the first row to the second and $d$, $d^\sigma$ from the second row to the third.]

Let $(h_n, \dots, h_1)$ be a decomposition of $h$ obtained by concatenating decompositions of $a$ and $b$, so that $b = b' \circ h_1$ and $a = h_n \circ a'$; then $g = b'^\sigma \circ h_1^\sigma \circ h_n \circ a' = c \circ d$. Decompose $c$ and $d$ in any which way: $d = d_l \circ \dots \circ d_1$ and $c = c_m \circ \dots \circ c_1$; then $g = c_m \circ \dots \circ c_1 \circ d_l \circ \dots \circ d_1$ is a decomposition of $g$, so a right part of it must be $a$: this is where we use the assumption that $h$ is cracked at the edge to obtain that for some linear $L$

• either $a = L \circ d_r \circ \dots \circ d_1$ for some $r \le l$, and then $b^\sigma = c \circ d_l \circ \dots \circ d_{r+1} \circ L^{-1} =: c \circ d'$, and $d = d' \circ a$;

• or $a = L \circ c_r \circ \dots \circ c_1 \circ d_l \circ \dots \circ d_1 =: c' \circ d$ for some $r \le m$, and then $b^\sigma = c_m \circ \dots \circ c_{r+1} \circ L^{-1}$ and $c = b^\sigma \circ c'$.

In the second case, $h = a \circ b = c' \circ d \circ b$ and $f = d^\sigma \circ c = d^\sigma \circ b^\sigma \circ c'$, so $(h, f, d \circ b)$ is a plain skew-twist.

In the first case, let's insert the information that $d = d' \circ a$ into the diagram:

[Commutative diagram (17): the same three rows, now labelled $h = a \circ b$, $g = b^\sigma \circ a = c \circ d' \circ a$ and $f = (d' \circ a)^\sigma \circ c$.]

Now forget all about $g$, but insert a horizontal arrow one level below:

[Commutative diagram (18): rows labelled $h = a \circ b$ and $f = (d' \circ a)^\sigma \circ c$, with an additional horizontal arrow one level below the first.]

Now in the bottom square of this diagram $(h^\sigma, f, d')$ is a plain skew-twist, and $d \circ b = d' \circ (a \circ b) = d' \circ h$, as wanted. □
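The first case of the proof can be replayed on a small example. The sketch below is an illustration under stated assumptions: $\sigma$ is complex conjugation (an involution, so $\sigma^{-1} = \sigma$), polynomials are represented as Python functions, and the specific choices of $a$, $c$ and $d'$ are hypothetical. It builds $b$, $d$, $h$ and $f$ as in the proof and checks, on a grid of Gaussian integers, that $(h^\sigma, f, d')$ is a plain skew-twist and that $d \circ b = d' \circ h$.

```python
def conj_poly(p):
    """Return p^sigma for sigma = complex conjugation: x -> conj(p(conj(x)))."""
    return lambda x: p(x.conjugate()).conjugate()

# Hypothetical non-linear building blocks.
a = lambda x: x * x + 1j
c = lambda x: x * x * x + x
d_prime = lambda x: x * x - 1j          # plays the role of d'

# First case of the proof: b^sigma = c o d' and d = d' o a.
b_sigma = lambda x: c(d_prime(x))
b = conj_poly(b_sigma)                  # valid because sigma is an involution
d = lambda x: d_prime(a(x))

h = lambda x: a(b(x))                   # h = a o b
f = lambda x: conj_poly(d)(c(x))        # f = d^sigma o c

a_sigma = conj_poly(a)
h_sigma = conj_poly(h)
d_prime_sigma = conj_poly(d_prime)

points = [complex(re, im) for re in range(-1, 2) for im in range(-1, 2)]

# (h^sigma, f, d') is a plain skew-twist with left part A = a^sigma o c:
#   h^sigma = A o d'   and   f = d'^sigma o A.
skew_twist_ok = all(
    h_sigma(x) == a_sigma(c(d_prime(x))) and f(x) == d_prime_sigma(a_sigma(c(x)))
    for x in points
)

# The correspondences match up: d o b = d' o h.
relation_ok = all(d(b(x)) == d_prime(h(x)) for x in points)
```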

3.8.2. translating back to combinatorics. 

Corollary 3.3.2. Suppose that $\vec h$ is a decomposition of $h$, $h$ is cracked at the edge, $u \in M_k$ is a sequence of Ritt swaps, and $0 < n, m < k$; then there is another sequence of Ritt swaps $v \in M_k$ such that $\phi^n u \phi^m * \vec h = \phi^{m+n} v * \vec h$.

Proof. To relate this to the proposition above, let $\vec g := \phi^m * \vec h$ and $\vec g' := u * \vec g$ be two decompositions of the same polynomial $g$, and let $\vec f := \phi^n u \phi^m * \vec h$ be a decomposition of $f$. Further, let $b := h_m \circ \dots \circ h_1$ and let $d := g'_n \circ \dots \circ g'_1$. Now $h$, $g$, $f$, $b$, and $d$ satisfy the hypotheses of the proposition. If $(h, f, d \circ b)$ is a plain skew-twist, take $v$ to be the sequence of Ritt swaps that turns $\vec h$ into the decomposition of $h$ that has $d \circ b$ as an initial segment.

Otherwise, take $v$ to be the sequence of Ritt swaps that turns $(\vec h)^\sigma$ into the decomposition that has $d'$ as an initial segment, and note that

l (fiV. 



□ 



Corollary 3.3.3. Suppose that $\vec h$ is a decomposition of $h$, $h$ is cracked at the edge, and $w \in B_k$ does not contain $\beta$. Then there is some $v \in M_k$ such that



Proof. Use the previous corollary repeatedly and carefully. Write $w := \phi^{a_m} u_m \phi^{a_{m-1}} u_{m-1} \dots u_1 \phi^{a_0}$. Use the previous corollary to find $v_1$ such that $\phi^{a_1 + a_0} v_1 \approx_h \phi^{a_1} u_1 \phi^{a_0}$. Let $\vec g := v_1 * \vec h$, still cracked at the edge because of the universal quantifier on decompositions in the definition of "cracked at the edge".

Now we're looking at $\phi^{a_m} u_m \phi^{a_{m-1}} u_{m-1} \dots u_2 \phi^{a_1 + a_0} * \vec g$, essentially decreasing $m$ by 1. □

Corollary 3.3.4. Suppose that $\vec h$ is a decomposition of $h$, $h$ is cracked at the edge, and $w \in B_k$; then there is some $n \in \mathbb{N}$ and some $v \in M_k$ such that $w \approx \beta^n v$ or $w \approx \phi^n v$.

Proof. First use Lemma 3.3.10 to pull all the instances of $\beta$ out left, then use the previous corollary, and then cancel any $\beta\phi$ pairs. □

Having inducted, let us forget combinatorics and restate the above result: 

Corollary 3.3.5. If $h$ is cracked at the edge, and an irreducible correspondence $A$ between $h$ and $g$ comes from skew-twists, then there are decompositions $\vec h$ of $h$ and $\vec g$ of $g$ and an integer $n$ such that $A$ is the correspondence encoded by $\beta^n * \vec h = \vec g$ or by $\phi^n * \vec h = \vec g$.

Proof. Where did the $v$ from the above Corollary go? Into the freedom to choose a decomposition of $g$. □

We rewrite the above without naming decompositions. 

Corollary 3.3.6. If g is cracked, and an irreducible correspondence A between g 
and f is encoded by a sequence of skew-twists, then A is either a graph of a a °/^ n 
for some n and for some initial compositional component a of f ; or the correspond- 
ing thing in the other direction. 

3.8.3. who is cracked? Now that we have a theorem about cracked polynomials, we 
should prove that some polynomials are indeed cracked. First, some reminders and 
one new definition: 

Definition 3.3.11. • A pair of indecomposables $(a, b)$ is called swappable if there is a Ritt swap $a \circ b = c \circ d$.

• An indecomposable is called swappable if it is linearly related to a ritty polynomial.

• A pair of indecomposables $(a, b)$ is called double-jumpable from the right if there is some indecomposable $c$ such that $t_2 t_1 * (a, b, c)$ is defined. Similarly for double-jumpable from the left.

• A pair of indecomposables $(a, b)$ is called a mixing bowl if there are indecomposables $c$ and $d$ such that $t_2 t_3 t_1 * (c, a, b, d)$ is defined.

Note that in each of the above cases, something is not a crack: (a, b) for swap- 
pable, (a, bo c) for double-jumpable from the right, (c o a, b o d) for a mixing bowl. 
The first result will cover the bulk of polynomials and is very easy to prove. 

Proposition 3.3.5. Suppose $\vec g$ is a decomposition of a polynomial $g$ and one of the $g_i$ is not swappable; then $g$ is cracked.

Proof. Let $h$ be the plain skew-twist of $g$ such that $h_1$ is unswappable. Then for any decomposition $\vec f$ of $h$, $f_1$ is (the same up to linear-relatedness) unswappable. It is easy to see that whenever $a$ is an unswappable indecomposable, $(b \circ a, c)$ is a crack for any $b$ and $c$. So $h$ is cracked at the edge, and $g$ is cracked. □

Now a not-so-hard sufficient condition. 

Lemma 3.3.12. Suppose that for every decomposition $\vec{h}$ of a given polynomial h
the pair $(h_1, h_k)$ is not swappable, nor double-jumpable by indecomposables appearing
in some decomposition of h, nor a mixing bowl witnessed by indecomposables
appearing in some decomposition of h. Then h is cracked at the edge.

Proof. We need to show that $((h_i \circ \dots \circ h_1), (h_k \circ \dots \circ h_{i+1}))$ is a crack. Any other
decomposition of the same polynomial can be obtained from this one by a sequence of Ritt
swaps in the second canonical form with respect to the three pieces $a := (h_i, \dots, h_2)$,
$(h_1, h_k)$, and $b := (h_{k-1}, \dots, h_{i+1})$. Swaps within a and b are irrelevant. No swaps
are possible within $(h_1, h_k)$ since it is not swappable. Since $(h_1, h_k)$ is not
double-jumpable from the left (by $a_1$), $h_k$ cannot move left at all. So any decomposition
of $h = a \circ h_1 \circ h_k \circ b$ can be obtained from $(a_r, \dots, a_j, h_1, a_{j-1}, \dots, a_1, h_k, b_s, \dots, b_1)$
for some j by moving some of the $b_i$ left as per second canonical form. In particular,
$(a_r, \dots, a_j, h_1, a_{j-1}, \dots, a_1)$ will be a left segment of the resulting decomposition
unless $b_s$ can double-jump $a_1 \circ h_k$, or $h_1 \circ h_k$ if j = 1, from the right. But $b_s$ cannot
double-jump $h_1 \circ h_k$ since it is not double-jumpable, and $b_s$ cannot double-jump
$a_1 \circ h_k$ from the right, since that would make $h_1 \circ h_k$ a mixing bowl. □

3.8.4. essential translations. All of these, except when all factors are monomials,
are covered by the next borderguard section, though the result there is weaker.
Here, we build on our previous analysis of descalings in Section 3.4.1. In this
section, we describe how to verify the hypotheses of Lemma 3.3.12 for decompositions
without type C factors. Since we have already dealt with polynomials that have
an unswappable factor, and we are not ready to deal with type C factors, we make
the following assumption:

Notation/Assumption 3.3.1. Throughout this Section 3.8.4, all decompositions
have no type C factors and no unswappable factors.

Definition 3.3.12. We say that a left descaling $(M_i, h_i, L_i)$ of a decomposition is
clean if $M_k$ is a scaling and $L_i = \mathrm{id}$ whenever $h_i \circ L_i$ is already ritty.

The point of the definition is that whenever $L_i \neq \mathrm{id}$, the descaling has an
essential translation at $(i - 1) \bmod k$; with skew-twists, it makes sense to speak of an
essential translation at k. The following generalizes the definition of essential
translation in Section 3.4.1, in particular defining the notion of an essential translation
at k.

Definition 3.3.13. We say that h has an essential translation at i if $L_{(i+1) \bmod k} \neq \mathrm{id}$
in some clean left descaling of h.

In this case we also say that any decomposition f which is skew-linearly
equivalent to h has an essential translation before i, and even that the polynomial
$f := f_k \circ \dots \circ f_1$ has an essential translation before i.

We now prove that this is a good definition: that every decomposition admits a 
clean left descaling up to skew-conjugacy; that this definition does not depend on 
the choice of a clean left descaling, nor on the choice of a decomposition of a given 
polynomial up to skew-conjugacy. 

Lemma 3.3.13. For every decomposition f, some decomposition skew-linearly 
equivalent to f admits a clean left descaling. 

Proof. Lemma 3.2.7 proves that any decomposition f admits a left descaling. To
obtain $L_i = \mathrm{id}$ whenever $h_i \circ L_i$ is already ritty, replace $(M_i, h_i, L_i)$ by $(M_i, h_i \circ L_i, \mathrm{id})$
whenever possible. For the last bit, write $M_k = T \circ S$ for a translation T and scaling
S and look at $T^{-1} \circ f \circ T^{\sigma^{-1}}$. □
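
The splitting $M_k = T \circ S$ used at the end of this proof can be made explicit. The sketch below assumes that a scaling means $x \mapsto cx$ and a translation means $x \mapsto x + d$; the function name is hypothetical and only serves the illustration.

```python
def split_linear(c, d):
    """Split M(x) = c*x + d as T o S, with S(x) = c*x and T(x) = x + d."""
    def scaling(x):
        return c * x

    def translation(x):
        return x + d

    return translation, scaling

# Example: M(x) = 3x + 5 is "scale by 3, then translate by 5".
T, S = split_linear(3, 5)
values = [T(S(x)) for x in range(-3, 4)]
```

With this splitting, the translation part T sits outermost on the left, which is why conjugating by T removes it and leaves a scaling in front.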

Any two clean left descalings of the same decomposition are almost equal:

Lemma 3.3.14. If $(M_i, h_i, L_i)$ and $(B_i, g_i, A_i)$ are clean left descalings of the same
decomposition, then $M_k = B_k$, and $h_i \circ L_i = g_i \circ A_i$ for all i. Further, $h_i = g_i$ and
$L_i = A_i$ unless $g_i$ (and therefore $h_i$) is type J and $A_i \neq \mathrm{id}$.

Proof. First, observe that both $M_k$ and $B_k$ are the scaling by the leading coefficient
of the polynomial decomposed, so they are equal. Now choose linear $T_i$ for $1 \le i \le k - 1$
witnessing linear equivalence: $T_1 \circ h_1 \circ L_1 = g_1 \circ A_1$, and $T_i \circ h_i \circ L_i \circ T_{i-1}^{-1} = g_i \circ A_i$
for 1 < i < k, and $M_k \circ h_k \circ L_k \circ T_{k-1}^{-1} = B_k \circ g_k \circ A_k$.

Starting with $T_1$ and inducting up, it is clear that all $T_i$ must be translations.
Since $T_1 \circ h_1 \circ L_1 \circ A_1^{-1} = g_1$ and $h_1$ is not type C, $T_1 = \mathrm{id}$; inducting, we see that all
$T_i = \mathrm{id}$. This gives $h_i \circ L_i = g_i \circ A_i$ for all i. Unless $g_i$ is type J, $h_i = g_i \circ A_i \circ L_i^{-1}$
forces $A_i \circ L_i^{-1} = \mathrm{id}$. If $g_i$ is type J and $A_i = \mathrm{id}$, then $h_i \circ L_i = g_i$ is ritty, so $L_i$
must also be the identity. □

Corollary 3.3.7.

• One clean left descaling of f has an essential translation at i if and only
if every clean left descaling of f has an essential translation at i.

• If g is skew-linearly equivalent to f and both admit clean left descalings,
then clean left descalings of one have an essential translation at i if and
only if clean left descalings of the other do.

Proof. The first assertion follows immediately from the previous lemma. The
second one is true because linear functions witnessing skew-linear equivalence must be
scalings, as can be seen easily from the proof of the previous lemma. □

Lemma 3.3.15. (1) If f has an essential translation at i, then $(f_{i+1}, f_i)$ is
not swappable.

(2) If f has an essential translation at i, then any decomposition $t_j \star f$ obtained
from it by a Ritt swap at j also has an essential translation at i.

Proof. (1) Replacing f by a skew-linearly equivalent decomposition if necessary,
let $(M_i, h_i, L_i)$ be a clean left descaling of f. Then $(f_{i+1}, f_i)$ is swappable
if and only if $(h_{i+1} \circ L_{i+1}, h_i)$ is swappable. That requires the existence
of linear L, M, and N such that $(L \circ h_{i+1} \circ L_{i+1} \circ M^{-1})$ and $(M \circ h_i \circ N)$
are both ritty. Since $h_i$ and $h_{i+1}$ are not type C, L = M = id, and then
$h_{i+1} \circ L_{i+1}$ is ritty, contradicting the assumption that f has an essential
translation at i.

(2) We just proved that $j \neq i$. If $j \neq i + 1, i, i - 1$, the relevant factors are
unchanged and the conclusion is immediate.

If j = i + 1, let translations L, M, N witness the Ritt swap:
$b := L \circ h_{i+2} \circ L_{i+2} \circ M^{-1}$ and $a := M \circ h_{i+1} \circ L_{i+1} \circ N^{-1}$ are ritty, and $d \circ c = b \circ a$
is a basic Ritt identity. Since no factors are type C, L = M = id. Since
$a = h_{i+1} \circ L_{i+1} \circ N^{-1}$ is ritty but f has an essential translation at i, $N \neq \mathrm{id}$.
Now $t_{i+1} \star f = (\dots, d, c \circ N, h_i \circ L_i, \dots)$ has an essential translation at i unless
$c \circ N$ is ritty. Since $N \neq \mathrm{id}$, if $c \circ N$ is ritty, then c must be type J, which
is impossible according to Remark 3.1.5.

If j = i − 1, let translations L, M, N witness the Ritt swap:
$b := L \circ h_i \circ L_i \circ M^{-1}$ and $a := M \circ h_{i-1} \circ L_{i-1} \circ N^{-1}$ are ritty, and $d \circ c = b \circ a$
is a basic Ritt identity. Since no factors are type C, L = M = id. Now
$t_{i-1} \star f = (\dots, h_{i+1} \circ L_{i+1}, d, c \circ N, \dots)$ still has an essential translation at i.
□

That was the induction step for the following Lemma. 

Lemma 3.3.16. If f has an essential translation at k, then any clean left descal- 
ing of any decomposition of any polynomial skew-conjugate to f has an essential 
translation at i. 

Proof. The statement of this lemma is just like the definition except for a
universal quantifier instead of the existential. We need to show that if some clean left
descaling of some decomposition $\vec{g}$ of some polynomial g skew-conjugate to f has
an essential translation at i, then any clean left descaling of any decomposition $\vec{h}$
of any polynomial h skew-conjugate to f has an essential translation at i. Now $\vec{h}$
is skew-linearly equivalent to $w \star \vec{g}$ for some sequence of Ritt swaps $w \in M_k$, so we
induct on the length of w, using the last lemma as the induction step, and Lemma
3.3.7 for the base case where w is empty. □

Lemma 3.3.17. If h has an essential translation before 1, then h is cracked at the 
edge. 

Proof. The previous lemma shows that in any clean left descaling of any
decomposition $\vec{h}$ of any polynomial skew-conjugate to h, $L_1 \neq \mathrm{id}$. We will use Lemma 3.3.12
to obtain the conclusion of this lemma. In Lemma 3.3.15 above, we showed that
$(h_1, h_k)$ is not swappable; it is also neither double-jumpable nor a mixing bowl,
because in either of those, the first one or two Ritt swaps would preserve the
essential translation (again, according to Lemma 3.3.15), and the last Ritt swap would
have to be across an essential translation, which is impossible (again, according to
Lemma 3.3.15). □

3.9. border-guards. Our analysis so far leaves two cases unexamined: the case
when all factors of f are swappable, and one of them is type C; and the case
when all factors of f are swappable, none are type C, and there are no essential
translations. Both can be attacked with some further combinatorial machinery that
we now develop.

3.9.1. definition. We define a submonoid $G_k$ of $B_k$; all words in $G_k$ will leave $f_k$ in
its place, though possibly altering it via Ritt swaps in the sense of Remark 3.1.6.

Definition 3.3.14. Let $G_k$ be the free monoid generated by $t_1$ through $t_{k-2}$ and
$\psi$ and $\gamma$.

Embed $G_k$ in $B_k$ by mapping $t_i$ to $t_i$ and $\psi$ to $(t_{k-1}\phi)$ and $\gamma$ to $(\beta t_{k-1})$.

We define the action of $G_k$ on Sf by identifying $G_k$ with its image in $B_k$ and
denote the action by the same symbol $\star$. $G_k$ gives just enough wiggle room to
perform all the Ritt swaps:

Proposition 3.3.6. Any word w in $B_k$ is equivalent to $\phi^N w'$ or to $\beta^N w'$ for some
integer N and some word w' in $G_k$.

Proof. We take $w \in B_k$, start from the right, and move to the left. At every step,
we have a word $w_{bad} \beta^a \phi^b w_{good}$ with $w_{bad} \in B_k$ and $w_{good} \in G_k$. We begin with
$w_{bad} = w$ and induct on its length; we begin with a = b = 0 and $w_{good}$ empty. At
the induction step, we must make $w_{bad}$ shorter. Thus, it is sufficient to prove

Claim 1: If $w_{good} \in G_k$, $a, b \in \mathbb{N}$, and s is a generator of $B_k$, then there are
$a', b' \in \mathbb{N}$ and $w'_{good} \in G_k$ such that $s \beta^a \phi^b w_{good} \approx \beta^{a'} \phi^{b'} w'_{good}$.

If $s = \beta$, let a' := a + 1, b' := b, and $w'_{good} := w_{good}$.
If $s = \phi$ and a > 0, let a' := a − 1, b' := b, and $w'_{good} := w_{good}$.
If $s = \phi$ and a = 0, let a' := a, b' := b + 1, and $w'_{good} := w_{good}$.
If $s = t_i$ for some i, we need

Claim 2: For any $t_i$ a generator of $M_k$ and any $a, b \in \mathbb{N}$, there exists u a generator
of $G_k$ and $a', b' \in \mathbb{N}$ such that $t_i \beta^a \phi^b \approx \beta^{a'} \phi^{b'} u$.

We prove Claim 2 by induction on (a + b).

Base cases:

If a = b = 0 and $i \neq k - 1$, we let a' = a, b' = b, and $u = t_i$.

If a = b = 0 and i = k − 1, note that $t_{k-1} \approx \phi\beta t_{k-1} = \phi\gamma$, so let a' = 0, b' = 1, and
$u = \gamma$.

Inductive cases:

If a = 0 and $i \neq k - 1$, then $t_i \phi \approx \phi t_{i+1}$ and we can apply the inductive hypothesis
to $t_{i+1} \phi^{b-1}$.

If a = 0 and i = k − 1, then $b \neq 0$. If b = 1, then we are looking at $(t_{k-1}\phi)$, so we
let a' = b' = 0 and $u = \psi$. If $b \ge 2$, note that $t_{k-1}\phi^2 \approx \phi^2 t_1$, so we can apply the
inductive hypothesis to $t_1 \phi^{b-2}$.

If $a \neq 0$ and $i \neq 1$, then $t_i \beta \approx \beta t_{i-1}$ and we can apply the inductive hypothesis to
$t_{i-1} \beta^{a-1} \phi^b$.

If $a \neq 0$ and i = 1, note that $t_1 \beta \approx \beta^2 t_{k-1} \phi$. If a = 1, then we get
$t_1 \beta \phi^b \approx \beta^2 t_{k-1} \phi^{b+1}$ and we can apply the second inductive step to $t_{k-1} \phi^{b+1}$. If $a \ge 2$,
we get $t_1 \beta^a \phi^b \approx \beta^2 t_{k-1} \phi \beta^{a-1} \phi^b \approx \beta^2 t_{k-1} \beta^{a-2} \phi^b$, and we can apply the inductive
hypothesis to $t_{k-1} \beta^{a-2} \phi^b$.

Now we have proved Claim 2, which is sufficient to prove Claim 1, which is
sufficient to prove the proposition. □
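
The case analysis of Claim 2 is effectively a recursive rewriting procedure, and it can be sketched as code. Everything below is an illustration under stated assumptions: the function names are hypothetical, and the permutation model (with $t_i$ a transposition of adjacent positions, $\phi$ a cyclic shift and $\beta$ its inverse) is only a toy model in which the rewriting rules quoted in the proof happen to hold, used here as a sanity check rather than as the action on decompositions.

```python
from functools import reduce

def compose(p, q):
    """Composite permutation: apply q first, then p."""
    return tuple(p[q[j]] for j in range(len(p)))

def perm_of(gen, k):
    """Toy permutation attached to a generator (an assumption of this sketch)."""
    if gen == "phi":
        return tuple((j - 1) % k for j in range(k))
    if gen == "beta":
        return tuple((j + 1) % k for j in range(k))
    if gen == "psi":                      # psi = t_{k-1} phi
        return compose(perm_of(("t", k - 1), k), perm_of("phi", k))
    if gen == "gamma":                    # gamma = beta t_{k-1}
        return compose(perm_of("beta", k), perm_of(("t", k - 1), k))
    _, i = gen                            # ("t", i) swaps positions i-1 and i
    p = list(range(k))
    p[i - 1], p[i] = p[i], p[i - 1]
    return tuple(p)

def word_perm(word, k):
    return reduce(compose, (perm_of(g, k) for g in word), tuple(range(k)))

def prepend_phi(a, b, n):
    """Normalise phi^n beta^a phi^b by cancelling phi-beta pairs."""
    return (a - n, b) if a >= n else (0, b + n - a)

def claim2(i, a, b, k):
    """Rewrite t_i beta^a phi^b as beta^a' phi^b' u; return (a', b', u)."""
    if a == 0 and b == 0:
        return (0, 0, ("t", i)) if i != k - 1 else (0, 1, "gamma")
    if a == 0:
        if i != k - 1:                    # t_i phi ~ phi t_{i+1}
            a1, b1, u = claim2(i + 1, 0, b - 1, k)
            return prepend_phi(a1, b1, 1) + (u,)
        if b == 1:                        # the word is exactly psi
            return (0, 0, "psi")
        a1, b1, u = claim2(1, 0, b - 2, k)  # t_{k-1} phi^2 ~ phi^2 t_1
        return prepend_phi(a1, b1, 2) + (u,)
    if i != 1:                            # t_i beta ~ beta t_{i-1}
        a1, b1, u = claim2(i - 1, a - 1, b, k)
        return (a1 + 1, b1, u)
    if a == 1:                            # t_1 beta ~ beta^2 t_{k-1} phi
        a1, b1, u = claim2(k - 1, 0, b + 1, k)
    else:
        a1, b1, u = claim2(k - 1, a - 2, b, k)
    return (a1 + 2, b1, u)

def check(k, max_exp=4):
    """Compare both sides of Claim 2 in the toy permutation model."""
    for i in range(1, k):
        for a in range(max_exp + 1):
            for b in range(max_exp + 1):
                a2, b2, u = claim2(i, a, b, k)
                lhs = word_perm([("t", i)] + ["beta"] * a + ["phi"] * b, k)
                rhs = word_perm(["beta"] * a2 + ["phi"] * b2 + [u], k)
                if lhs != rhs:
                    return False
    return True
```

Calling `check(k)` for small k confirms that the recursion is at least consistent with the relations it uses; it does not replace the proof, since equality in the permutation model is weaker than the equivalence of words considered in the text.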

We get a weaker analog of Lemma 3.3.10.

Lemma 3.3.18. For any word $w \in G_k$, there are $w_i \in G_k$ such that $w \approx w_2 w_1$
and $\gamma$ does not appear in $w_1$ and $\psi$ does not appear in $w_2$.

Proof. Given $w \in G_k$, we find an equivalent word w' that has no substrings of the
form $\psi u \gamma$ for some $u \in M_{k-1}$. Clearly, w' is the desired word. To construct w', we
prove a

Claim: for any $u \in M_{k-1}$ there is a word $v' \in M_{k-1}$ such that $\psi u \gamma \approx v'$ or
$\psi u \gamma \approx \gamma t_{k-2} \psi v'$.

Then replacing a substring $\psi u \gamma$ by one of these does not increase the number
of instances of $\psi$ and $\gamma$ in a word, and straightens out one $\psi$, $\gamma$ pair in the wrong
order. Thus, after finitely many such operations we obtain the desired w'.

Proof of Claim: Without loss of generality, we may assume that $u \in M_{k-1}$ is in
reverse first canonical form, i.e. either u = v or $u = t_1 v$ where $t_1$ does not appear
in $v \in M_{k-1}$. Then $\psi u \gamma \approx v'$ in the first case, and $\psi u \gamma \approx \gamma t_{k-2} \psi v'$ in the second,
for some $v' \in M_{k-1}$. □

The purpose of the previous lemma is that with a sufficiently vicious $f_k$ guarding
the border, the number of instances of $\psi$ in $w_1$ and $\gamma$ in $w_2$ is severely limited.

Lemma 3.3.19. If $f_k$ is neither a monomial nor type C, then the correspondences
encoded by the words $w_i$ in Lemma 3.3.18 are (up to skew-conjugacy) graphs of $P_{n_i}$,
where $n_i$ are bounded by the product of the in- and out-degrees of $f_k$.

Proof. The careful reader will complain that type J indecomposables do not have a
well-defined out-degree; but we are only after a bound, and for a given type J or coJ
indecomposable $f_k$, there certainly is a bound on the out-degrees it can pretend to
have.

We sketch the proof for $w_1 \in G_k$ in which $\gamma$ does not appear; the proof for $w_2$ in
which $\psi$ does not appear is analogous. We write $w_1 = v_n \psi v_{n-1} \psi \dots v_1 \psi v_0$ for some
sequences of Ritt swaps $v_i \in M_{k-1}$. We induct on n. Without loss of generality we
may replace f by a decomposition where $f_k =: a$ is itself ritty and not a monomial.

Since $\psi v_0 \star f$ is defined, the second leftmost factor of $v_0 \star f$ is linearly equivalent
to a monomial: for some linear L and M, and for some prime (not necessarily
odd) p, we have $v_0 \star f = (a, L \circ P_p \circ M, \dots)$ and $t_{k-1} v_0 \star f = (P_p, b \circ M, \dots)$, where
$(a \circ L) \circ P_p = P_p \circ b$ is a basic Ritt identity, so b cannot be type J because of Remark
3.1.5. Then $\psi v_0 \star f = (b \circ M, \dots, P_p) =: g$, and the correspondence encoded by this
equation is the graph of $P_p$.

Note that g satisfies the hypotheses of the lemma. The in- and out-degrees of
its leftmost factor b are well-defined up to linear relatedness. Even if a was type
J, the out-degree of $(a \circ L)$ is well-defined (not up to linear relatedness). Now the
out-degree of b is $\frac{1}{p}$ times the out-degree of $(a \circ L)$. We now apply the inductive
hypothesis to $v_n \psi \dots \psi v_1 \star g$. □

As before, for the purpose of irreducible skew-invariant curves, we can cancel
$\beta\phi$, and therefore $\gamma\psi$. Since monomials commute, if $n_1$ and $n_2$ in the lemma have
a common factor, we may bring those two together, and then cancel them. So

Corollary 3.3.8. If $f_k$ is neither a monomial nor type C, and $w \in G_k$, then the
only skew-invariant irreducible component of the correspondence encoded by w is
defined by $x^{n_1} = y^{n_2}$ for relatively prime $n_1$ and $n_2$ bounded by the product of the
in- and out-degrees of $f_k$.

Together with the last item of Lemma 3.3.18, this characterizes correspondences
coming from skew-twists for polynomials that have at least one factor that is neither
a monomial nor type C.

Corollary 3.3.9. For any decomposition f with at least one factor that is neither
type C nor a monomial, there exists an integer $N_f$ such that for any $w \in B_k$
with $w \star f$ defined, there are $w_i \in G_k$ and integers a < k and N such that
$w \approx \phi^N w_2 w_1 \phi^a$ or $w \approx \beta^N w_2 w_1 \phi^a$; $w_1$ contains no occurrences of $\gamma$ and at most $N_f$
occurrences of $\psi$; $w_2$ contains no occurrences of $\psi$ and at most $N_f$ occurrences of $\gamma$;
and the correspondences encoded by $w_i$ are graphs of monomials of degree at most
$N_f$.

The $N_f$ in the corollary is the product of in- and out-degrees of the factor that
is neither type C nor a monomial, and the maximum possible such if the factor is
type J so that its out-degree is not well-defined.

This leaves two questions. What about polynomials each of whose factors is type
C or linearly related to a monomial? How do correspondences coming from Theorem
3.3 interact with those coming from skew-twists?

3.10. Maximal Odd chebyclumps. Now we deal with the possibility that all
factors of f are monomials or type C; and at least one of the factors is type C. We
make heavy use of chebyclumps introduced for the proof of Theorem 3.3.
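
The reason Chebyshev factors clump is that they commute under composition. The sketch below checks this for one pair, using the classical normalisation $T_n(\cos\theta) = \cos(n\theta)$; this normalisation is an assumption, and the polynomials $C_n$ of the text may differ from $T_n$ by a linear change of variable. Function names are hypothetical.

```python
def trim(p):
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

def poly_add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def poly_compose(p, q):
    """Coefficients of p(q(x)), lowest degree first, via Horner's scheme."""
    result = [0]
    for c in reversed(p):
        result = poly_add(poly_mul(result, q), [c])
    return result

def chebyshev(n):
    """T_n from the recurrence T_{n+1} = 2x T_n - T_{n-1}."""
    prev, cur = [1], [0, 1]
    if n == 0:
        return prev
    for _ in range(n - 1):
        nxt = poly_add(poly_mul([0, 2], cur), [-c for c in prev])
        prev, cur = cur, nxt
    return cur

t2_after_t3 = poly_compose(chebyshev(2), chebyshev(3))
t3_after_t2 = poly_compose(chebyshev(3), chebyshev(2))
```

Both composites equal $T_6 = 32x^6 - 48x^4 + 18x^2 - 1$, which is the kind of identity that lets two Chebyshev factors trade places inside a clump.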

There, we proved that in any decomposition, type C factors occur in clumps 
which are well-defined up to linear equivalence, and which persist (up to invading 
quadratic factors) under Ritt swaps. We also proved that Ritt swaps involving 
two odd Chebyshevs can only occur within these clumps; it is clear that the only 
other factor that can Ritt swap all the way through, or even any distance into, an 
odd chebyclump is Q. This makes odd chebyclumps effective borderguards. We 
begin by extending the technical results to the new context of skew-twists, showing 
that chebyclumps are invariant under skew-linear equivalence, and that they persist 
under single skew-twists. First, we must adjust the notion of maximality. 

Definition 3.3.15. A maximal chebyclump $(g_k, \dots, g_j)$ of a decomposition $(g_k, \dots, g_1)$,
with j > 1, is called skew-maximal if $(g_1, g_k, \dots, g_j)$ is not a chebyclump.

Lemma 3.3.20. Suppose that a decomposition f contains a maximal chebyclump.
Then there is a plain skew twist $g := \phi^i \star f$ such that $(g_k, \dots, g_j)$ is a maximal
chebyclump for some j.

Proof. Suppose that $(f_b, \dots, f_a)$ is a maximal chebyclump of f; then $(g_k, \dots, g_{k-b+a})$
is a maximal chebyclump of $g := \phi^{k-b} \star f$. □

Now we begin to look at correspondences encoded by the words $w_i$ in Lemma
3.3.18 acting on a decomposition that has a maximal chebyclump on the left.

Lemma 3.3.21. Suppose that $\vec{f}$ is a decomposition of a polynomial f of degree $o' \cdot 2^t$
for some odd o'; suppose that f is not skew-conjugate to a Chebyshev polynomial;
suppose that $(f_k, \dots, f_a)$ is a maximal chebyclump of degree $o \cdot 2^r$; and suppose that
the degree of $f_k$ is odd. Let $w_1 \in G_k$ not contain any instances of $\gamma$, and suppose
that $w_1 \star f$ is defined. Then the correspondence $A_{w_1}$ is the graph of a Chebyshev
polynomial $C_N$ where N divides $o \cdot 2^{t+1}$.

Proof. Write $w_1 = v_n \psi \dots \psi v_0$, and replace f by a skew-conjugate decomposition
such that $f_i = C_{p_i}$ for each $a \le i \le k$, where $p_i$ may be 2.

We induct on the odd part of the degree of the chebyclump. In the decomposition
$g := v_0 \star f$, $(g_k, \dots, g_b)$ is still a chebyclump for some $b \le k - 1$, with the same odd
part of the degree as the chebyclump $(f_k, \dots, f_a)$ in f. We may again assume that
$g_i = C_{q_i}$ for each $b \le i \le k$. Then

$\psi v_0 \star f = \psi \star g = \phi t_{k-1} \star g = \phi \star (C_{q_{k-1}}, C_{q_k}, \dots) = (C_{q_k}, \dots, C_{q_{k-1}}) =: h$.

The correspondence encoded is the graph of $C_{q_{k-1}}$. If $q_{k-1}$
is odd, then the degree of the maximal chebyclump $(h_k, \dots, h_i)$ is $1/q_{k-1}$ times
the degree of the chebyclump $(f_k, \dots, f_a)$ in f; in particular, the odd part of the
degree of the new chebyclump is less than the odd part of the degree of the old
chebyclump, so we have completed the induction step.

Note that, since f is not skew-conjugate to a Chebyshev polynomial, the whole
h is not a chebyclump, so the odd $h_1 = C_{q_{k-1}}$ cannot rejoin the chebyclump
$(h_k, \dots, h_i)$ via Ritt swaps.

We now tell a story in order to avoid gruesome notation. What is different when
$q_{k-1} = 2$? What prevents us from inducting on the whole degree of the chebyclump
is the fact that it is not invariant under Ritt swaps, and may indeed grow from a
decomposition h to the decomposition $v_i \star h$ as new quadratic factors join the
chebyclump. This can happen in two distinct ways: either some quadratic $f_i$ for
i < a from the original decomposition finds its way into the chebyclump (which
is accounted for by $2^t$), or in some $\psi v_i \dots \psi v_0 \star f$ the rightmost factor is a new
quadratic that has just been pushed across the border by $\psi$, and $v_{i+1}$ brings this
quadratic all the way to the chebyclump on the left of the decomposition. This may
happen once, but if it happens twice, Lemma 3.3.4, a little bit of care, and a whole
lot of notation force the whole f to be a chebyclump, and f to be skew-conjugate
to a Chebyshev polynomial. □

A similar proof yields the corresponding statement for $w_2 \in B_k$ in which $\psi$
doesn't appear, and together with Lemma 3.3.18 they give

Proposition 3.3.7. Given a word $w \in B_k$ and a decomposition $\vec{f}$ with at least one
type C factor, let o be the degree of the largest maximal odd chebyclump in $\vec{f}$, and
let t be maximal such that $2^t$ divides the degree of f. Suppose that $w \star \vec{f}$ is defined.
Then there are words $w_i$ in $G_k$ and integers a < k and N, such that

$w \approx \phi^N w_2 w_1 \phi^a$ or $w \approx \beta^N w_2 w_1 \phi^a$

and

• $w_1$ contains no instances of $\gamma$ and the correspondence from f to g encoded
by $w_1 \star f =: g$ is the graph of $C_A$, where A divides $o \cdot 2^{t+1}$; and

• $w_2$ contains no instances of $\psi$ and the correspondence from h to g encoded
by $w_2 \star g =: h$ is the graph of $C_B$, where B divides $o \cdot 2^{t+1}$.

3.11. skew twists summary. We have now described all correspondences that
arise from skew-twists. We have three possible conclusions, the strongest for cracked
polynomials in Corollary 3.3.6, and two weaker ones in Corollary 3.3.9 and in
Proposition 3.3.7.

Corollary 3.3.6 applies to polynomials with unswappable factors and to
polynomials with an essential translation (and, therefore, no type C factors).

Proposition 3.3.7 applies to trivial polynomials with at least one type C factor.

Corollary 3.3.9 applies to polynomials with at least one factor that is neither
type C nor a monomial; it is useful if some factor is type C, or if there are no
essential translations.

This exhausts trivial polynomials, as a polynomial all of whose factors are linearly
related to monomials either has an essential translation, or is skew-conjugate to a
monomial and therefore is not trivial.

Actually, the vast majority of decompositions are cracked, yielding the much
stronger conclusion of Corollary 3.3.6, but we do not wish to bore the reader with
more computations and a long list of exceptions.

3.12. How skew-twists interact with correspondences from Theorem 3.3.

Correspondences from Theorem 3.3 are composed of pieces in each of which $\pi$ is a
monomial $P_p$ for some prime p. When $P_p$ is also a compositional component of f,
these turn out to be skew-twists. Otherwise, they commute with skew-twists.

Proposition 3.3.8. Suppose that $f_i = P_p$ in a decomposition $\vec{f}$ of f, and $\pi = P_p$
gives a morphism from f to some g. Then this morphism is actually a skew twist,
i.e. there is a decomposition $\vec{h}$ of f such that $h_1 = P_p$.

Proof. Since the foreign $P_p$ from $\pi$ cannot Ritt swap with the native $P_p$ from f, it
is the native $P_p$ from f that must end up all the way at the beginning after the
sequence of skew-twists that turns $(P_p, f_k, \dots, f_1)$ into $(g_k, \dots, g_1, P_p)$. Putting that
sequence into first canonical form shows that the native $P_p$ can end up all the way
at the beginning before the foreign $P_p$ does anything, which is precisely what we
want to show. □

So this correspondence is a single skew-twist for polynomials, and it is a sequence
of Ritt swaps followed by a single skew-twist followed by a sequence of Ritt swaps for
the decompositions. These have already been completely characterized in various
cases, summarized in the next section. This leaves the case when $\pi = P_p$ is not a
compositional factor of f.

Proposition 3.3.9. Suppose A is a correspondence from g to h coming from
skew-twists, and that $\pi$ gives a morphism from f to g as in Theorem 3.3. Then there
is some d which admits a correspondence $\mathcal{D}$ from f to d coming from skew twists,
and such that d, $\pi$, and h are as in Theorem 3.3. Indeed, $\mathcal{D}$ is encoded by the same
word $w \in B_k$ as A.
And conversely.

Proof. By induction, it is sufficient to show this for single Ritt swaps and single
skew twists. Both are completely obvious: Ritt swaps with $P_p$ only change the
degrees of p in the in- and out-degrees of the factors of f, which does not affect the
ability of other factors $P_q$ of f to swap, because that ability depends on factors of
q in the in- and out-degrees. □

This proposition shows that correspondences coming from skew-twists commute
with those coming from Theorem 3.3.

Remark 3.3.3. Correspondences from Theorem 3.3 are always defined by $x^n = y^m$
for some integers m and n. If m and n are not relatively prime, this correspondence
is reducible, as it is the correspondence $x^p = y^p$ composed with some other stuff. Its
irreducible components are given by $x = \xi y$ for pth roots of unity $\xi$. In the context
of dynamics, when we assume that the algebraic closure of the prime field sits inside
the fixed field of $\sigma$, only the diagonal among these is invariant, and the others are
periodic, in contrast with correspondences of this form that came from skew twists,
where the other components were strictly pre-periodic.
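
The reducibility claimed in this remark rests on the identity $A^p - B^p = \prod_{\xi^p = 1} (A - \xi B)$ applied to $A = x^{n/p}$, $B = y^{m/p}$. The following numerical sketch checks it at one complex point; the function name and the sample values are assumptions chosen for illustration.

```python
import cmath

def components_product(x, y, n, m, p):
    """Product over p-th roots of unity xi of (x^(n/p) - xi * y^(m/p)).

    Assumes p divides both n and m.
    """
    prod = 1
    for r in range(p):
        xi = cmath.exp(2j * cmath.pi * r / p)   # a p-th root of unity
        prod *= x ** (n // p) - xi * y ** (m // p)
    return prod

# Sample point and exponents (illustrative): x^6 = y^4 with common factor p = 2.
x, y = 1.3 + 0.2j, 0.7 - 0.5j
gap = abs(components_product(x, y, 6, 4, 2) - (x ** 6 - y ** 4))
```

Up to rounding error `gap` is zero, reflecting that the curve $x^6 = y^4$ splits into the two components $x^3 = \pm y^2$.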

This is the last bit we needed for a complete description of correspondences
between $\sigma$-varieties given by polynomials.

3.13. answers. We prove in the next section that all skew-invariant curves come
from skew-twists and from Theorem 3.3, and thus are listed in this theorem.

Theorem 3.4. Given two trivial polynomials f and h, any (f, h)-skew-invariant
curve coming from skew twists and Theorem 3.3 is of the form $A_3 \circ A_2 \circ A_1$ where

• $A_3$ is the graph of an initial compositional factor a of f: it is an $(f, \tilde{f})$-
skew-invariant curve, where $f = b \circ a$ and $\tilde{f} = b \circ a^{\sigma}$.

• $A_2$ is an $(\tilde{f}, g)$-invariant curve, which is

- the diagonal if f has an unswappable factor, or if it is linearly related
to a monomial.

- defined by $x^M = y^N$ if some indecomposable compositional factor of f
is not linearly related to any monomial or Chebyshev polynomial. In
this case, M and N are relatively prime and bounded by the degree of
that factor.

- defined by $C_M(x) = C_N(y)$ if some indecomposable compositional factor
of f is linearly related to some Chebyshev polynomial. In this case,
M and N are relatively prime and each divides $2 \deg f$.

• $A_1$ is the graph of $c^{\sigma^n} \circ g^{\lozenge n}$ for some initial compositional factor c of
g, where $g = d \circ c$ and $h = d^{\sigma^n} \circ c^{\sigma^{n+1}}$.

Proof. With Proposition 3.3.9, it is sufficient to insert one correspondence coming
from Theorem 3.3 in a place of our choice in the word in $B_k$ in Corollary 3.3.6,
Corollary 3.3.9, or Proposition 3.3.7. Correspondences from Theorem 3.3 do not
appear for polynomials with an unswappable factor, nor for trivial polynomials
linearly related to a monomial. In both other cases we choose to insert it between
$w_2$ and $w_1$. To see that this works in Proposition 3.3.7, note that since there is
a type C factor, the correspondence coming from Theorem 3.3 is the graph of $x^2$,
which is not itself a skew-twist only if f has no quadratic factors. □

In fact, $A_2$ is the diagonal for the vast majority of polynomials, but the precise
characterization of exceptions is tedious. For example, $A_2$ is the diagonal if f has
an essential translation, or if some decomposition of f has two factors $f_i$ and $f_j$
with $\deg_{in}(f_i) \cdot \deg_{out}(f_i)$ and $\deg_{in}(f_j) \cdot \deg_{out}(f_j)$ relatively prime.

4. Skew invariant varieties 

In this section we complete our classification of the skew-invariant varieties for
$\sigma$-varieties of the form $\Phi : \mathbb{A}^n \to \mathbb{A}^n$ where $\Phi$ is given by a sequence of univariate
polynomials. The results of Section 3 can be used directly to describe the
skew-invariant plane curves. To describe the skew-invariant varieties in higher dimensions
we use the model theory of difference fields to reduce to the case of plane curves.
After recalling two important ideas from model theory, triviality and orthogonality,
we show how the problem of classifying skew-invariant varieties reduces to the cases
of linear dynamics, dynamics defined by monomials, and the cases considered in
Section 3. We then dispose of the first two cases and conclude by combining these
results.

4.1. Difference algebraic geometry. Recall from Section 2 that a difference field
is a field K given together with a distinguished endomorphism $\sigma : K \to K$. The first
order theory of difference fields admits a model companion, ACFA, axiomatized by
saying that K is difference closed in the sense that it is algebraically closed, $\sigma$ is an
automorphism, and for every irreducible algebraic variety X over K and irreducible
subvariety $\Gamma \subseteq X \times X^{\sigma}$ which projects dominantly in both directions the set of
points $(X, \Gamma)^{\sharp}(K, \sigma) := \{a \in X(K) : (a, \sigma(a)) \in \Gamma(K)\}$ is Zariski dense in X. To
say that ACFA is a model companion of the theory of difference fields includes the
assertion that every difference field extends to a difference closed field. Note that if
X is an irreducible variety over a difference closed field $(K, \sigma)$ and $f : X \to X^{\sigma}$ is
a dominant map making (X, f) into a $\sigma$-variety, then the axioms for ACFA include
the assertion that $\{a \in X(K) : \sigma(a) = f(a)\}$ is Zariski dense in X. These facts,
that every difference field extends to a difference closed field and that the solutions
to $\sigma(x) = f(x)$ become Zariski dense, are what allow us to deduce strong structure
theorems for algebraic dynamical systems and $\sigma$-varieties from the corresponding
theorems about definable sets in difference closed fields proven by Chatzidakis and
Hrushovski in [6].

Before we discuss orthogonality and triviality, let us introduce one new notion.
If (X, f) is a $\sigma$-variety over some difference field $(K, \sigma)$, then a $\sigma$-subvariety is
a subvariety $Y \subseteq X$ for which $f|_Y : Y \to Y^{\sigma}$. This is an important notion,
and one which is directly implicated by the problem of the classification of the
skew-invariant varieties, but we need to consider a slightly stronger condition. A
subvariety $Y \subseteq X$ is a difference subvariety if in some difference closed field $(L, \sigma)$
extending $(K, \sigma)$ the set of points $\{a \in Y(L) : f(a) = \sigma(a)\}$ is Zariski dense in
Y. Provided that K is algebraically closed, one may replace the phrase "in some
difference closed field" by "in every difference closed field."

Two notions from geometric stability theory, a part of modern model theory, are 
important for us. Fortunately, their usually complicated definitions based on the 
theory of forking can be replaced by simpler statements in our special setting. 

Definition 4.0.1. Let $(K, \sigma)$ be an algebraically closed difference field and (X, f)
and (Y, g) two irreducible $\sigma$-varieties over $(K, \sigma)$ with f and g dominant. We say
that (X, f) and (Y, g) are almost orthogonal over K, written $(X, f) \perp^{a}_{K} (Y, g)$, if
every difference subvariety of $(X \times Y, (f, g))$ is a product of a difference subvariety
of (X, f) with a difference subvariety of (Y, g). We say that (X, f) and (Y, g) are
orthogonal if for every difference field extension $(L, \tau)$ of $(K, \sigma)$ one has
$(X_L, f) \perp^{a}_{L} (Y_L, g)$, where we have written $X_L$ and $Y_L$ for the base changes of these
varieties to L.

Remark 4.1. Orthogonality is usually defined at the generic level. What we call 
(almost) orthogonality is usually called full (almost) orthogonality. As we are 
concentrating on the structure of difference varieties rather than on difference fields, 
which would be better encoded by generic behavior, we shall take full orthogonality 
as primitive. 

Remark 4.2. The distinction between almost orthogonality and orthogonality is 
real, but in the cases that concern us, ADs defined by univariate polynomials of 
degree at least two over fields of characteristic zero, the phenomenon does not 
appear. However, it is relevant for linear dynamics: for instance, to obtain an
isomorphism between $(\mathbb{A}^1, \mathrm{id})$ and $(\mathbb{A}^1, x \mapsto x + 1)$ one needs parameters beyond the
fixed field of $\sigma$. We will see shortly that some linear maps that are not isomorphic
over the fixed field are almost-orthogonal over the fixed field, while others are not. 

An important result for us is that orthogonality passes from pairs to products, 
though this result is not true for almost orthogonality. 

Fact 4.2.1. If $(K,\sigma)$ is a difference field, $(X_1,f_1), \dots, (X_r,f_r)$ and $(Y_1,g_1), \dots, (Y_s,g_s)$
are $\sigma$-varieties over $(K,\sigma)$ for which $(X_i,f_i) \perp (Y_j,g_j)$ for all $i$ and $j$, then

$$\prod_{i=1}^{r} (X_i, f_i) \;\perp\; \prod_{j=1}^{s} (Y_j, g_j).$$

Two contradictory notions of triviality for $\sigma$-varieties appear in the literature.
Sometimes (see for instance [?]), one says a $\sigma$-variety closely related to one
of the form $(X, \mathrm{id}_X)$ is trivial; this is not the notion we mean. Our triviality comes
from the model-theoretic notion of forking triviality, first isolated in the context of
stable theories (see [2]) and then successfully used in the context of difference fields
(see [6] and [8] for the development of the theory of forking in difference fields). The
following is really a theorem, but it suffices as a definition for the purposes of this
paper:

Definition 4.2.1. Let $(K,\sigma)$ be an algebraically closed difference field and $(X,f)$
a $\sigma$-variety over $(K,\sigma)$. We say that $(X,f)$ is trivial if for every $n \in \mathbb{Z}_+$, every
irreducible difference subvariety $Y \subseteq (X^{\times n}, f^{\times n})$ is a component of the intersection
$\bigcap_{1 \le i < j \le n} \pi_{i,j}^{-1} \pi_{i,j}(Y)$ where $\pi_{i,j} : X^{\times n} \to X^{\times 2}$, given by $(x_1, \dots, x_n) \mapsto (x_i, x_j)$, is
the projection onto the $i$th and $j$th coordinates.

The main theorem of the first author's doctoral thesis [12] gives an explicit
characterization of the rational functions $f$ for which $(\mathbb{P}^1, f)$ is trivial. We abuse
notation by saying "trivial polynomial $f$" to mean "polynomial $f$ such that $(\mathbb{P}^1, f)$ is
trivial." Similarly, we write $f \perp g$ to mean $(\mathbb{A}^1, f) \perp (\mathbb{A}^1, g)$. We state the (simpler)
special case of that theorem, where $f$ is a polynomial and the characteristic of the
field is zero, which is relevant to this paper.

Theorem 4.3. Suppose that $f$ is a non-constant, non-linear polynomial over a
difference field of characteristic zero, and suppose that $f$ is not skew-conjugate to
a monomial or a Chebyshev polynomial. Then the $\sigma$-variety $(\mathbb{A}^1, f)$ is trivial.

The hypotheses of this theorem are as weak as possible: the graph of multiplica- 
tion witnesses that monomials are not trivial. Each Chebyshev polynomial admits a 
2-to-1 cover by the corresponding monomial, inheriting its rich structure. Triviality
is invariant under isomorphisms, so in particular under skew-conjugation. 
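
The remark about monomials can be checked on examples. The following is a minimal sketch in Python, not part of the original argument: the helper names `P`, `on_mult_graph` and `apply_coordinatewise` are ours, and the sample points are arbitrary. It verifies that the graph of multiplication $\{(x,y,z) : z = xy\}$ is mapped into itself by the coordinatewise monomial map, which is the invariant surface in $\mathbb{A}^3$ alluded to above. It illustrates invariance only and proves nothing about triviality.

```python
from fractions import Fraction
from itertools import product

def P(n, x):
    """The monomial P_n(x) = x**n."""
    return x ** n

def on_mult_graph(pt):
    """True if pt = (x, y, z) lies on the graph of multiplication z = x*y."""
    x, y, z = pt
    return z == x * y

def apply_coordinatewise(n, pt):
    """Apply P_n to every coordinate of pt."""
    return tuple(P(n, c) for c in pt)

# Arbitrary rational sample points (an assumption made for illustration).
samples = [Fraction(a, b) for a, b in [(1, 2), (-3, 4), (5, 1), (7, 3)]]
graph_points = [(x, y, x * y) for x, y in product(samples, repeat=2)]

# (x*y)**n == x**n * y**n, so the image of a graph point is again on the graph.
all_invariant = all(
    on_mult_graph(apply_coordinatewise(n, pt))
    for n in (2, 3, 5)
    for pt in graph_points
)
```

Since each pair of coordinates of this surface projects dominantly onto $\mathbb{A}^2$, the surface is not a component of an intersection of pullbacks from pairs of coordinates, which is why it obstructs triviality.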

It is fairly easy to see that, when specialized to the case of $X$ being a curve, triviality
is equivalent to the nonexistence of families of difference subvarieties of $X^2$
other than horizontal and vertical lines (see Chapter 2 of [2] for details). From the
technical results of this paper, one immediately concludes that polynomials satisfying
the hypotheses of the theorem do not admit families of difference subvarieties
of $X^2$ other than horizontal and vertical lines, in effect reproving Theorem 4.3.

Combining the Zilber trichotomy for minimal types in ACFA proved in [6] with
Theorem 4.3 above and some easy observations, we obtain:

Proposition 4.3.1. Linear polynomials are non-orthogonal to each other and
orthogonal to all other polynomials. For a polynomial $f$, $P_n \not\perp f$ if and only if $f$ is
skew-conjugate to $P_n$ or $C_n$.

From these observations, we conclude that the difference varieties for coordinate- 
wise polynomial actions may be decomposed into pieces corresponding to each of 
the three classes from the Zilber trichotomy. 

Proposition 4.3.2. Suppose that the polynomials $\Phi_i$ are linear for
$1 \le i \le a$, skew-conjugate to monomials and Chebyshevs of degree $\ge 2$ for $a + 1 \le i \le b$,
and none of those for $b < i \le n$ and that $\Phi : \mathbb{A}^n \to \mathbb{A}^n$ is given by
$(x_1, \dots, x_n) \mapsto (\Phi_1(x_1), \dots, \Phi_n(x_n))$. Then any irreducible difference subvariety of
$(\mathbb{A}^n, \Phi)$ is of the form $A \cap B \cap C$, where $A = A_0 \times \mathbb{A}^{n-a}$, $B = \mathbb{A}^a \times B_0 \times \mathbb{A}^{n-b}$,
$C = \mathbb{A}^{a+b} \times C_0$, and each of $A$, $B$ and $C$ is a difference subvariety. Further, $B_0$
similarly breaks into pieces according to the degrees of the $\Phi_j$.

The possible $B_0$ were already classified via the study of one-based groups in difference
closed fields (see [?]). Indeed, non-linear monomials and Chebyshevs define
nontrivial, modular difference varieties, which were the key tools in Hrushovski's
proof of the Manin-Mumford conjecture [11]. We pay little attention to them in
this paper as their difference subvarieties, such as $B_0$ above, are well understood.

If $\phi : \mathbb{G}_m^g \to \mathbb{G}_m^g$ is given by $(x_1, \dots, x_g) \mapsto (x_1^{M_1}, \dots, x_g^{M_g})$ for integers $M_j \ge 2$,
then every difference subvariety of $(\mathbb{G}_m^g, \phi)$ is a finite union of translates of algebraic
subgroups of $\mathbb{G}_m^g$.

To deal with Chebyshevs, pull back the coordinates on which they act by $\pi(x) = x + \frac{1}{x}$
to obtain a sub-$\sigma$-variety of a $\sigma$-variety defined by monomials. It bears noting
that not every connected algebraic subgroup of $\mathbb{G}_m^g$ is a difference subvariety of
$(\mathbb{G}_m^g, \phi)$.

We have already classified the possible $C_0$ without having said so explicitly.
Indeed, as each $\Phi_i$ is trivial, $C_0$ must be a component of $\bigcap_{a+b < i < j \le n} \pi_{i,j}^{-1} \pi_{i,j}(C_0)$.
Thus, it suffices to describe the invariant curves for $(\Phi_i, \Phi_j) : \mathbb{A}^2 \to \mathbb{A}^2$.

An easy computation of ramification indices yields the following proposition. 

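Proposition 4.3.3. If $f$ and $g$ are trivial polynomials and an irreducible curve
$C \subseteq \mathbb{A}^2$ is a sub-$\sigma$-variety of $(\mathbb{A}^2, f \times g)$, then there are polynomials $\pi$, $\rho$, and $h$
such that $(a, b) \in C$ if and only if there is some $c$ with $\pi(c) = a$ and $\rho(c) = b$ and
the following commutes:

$$
\begin{array}{ccc}
\mathbb{A}^1 & \xrightarrow{\;f\;} & \mathbb{A}^1 \\
\uparrow{\scriptstyle \pi} & & \uparrow{\scriptstyle \pi^\sigma} \\
\mathbb{A}^1 & \xrightarrow{\;h\;} & \mathbb{A}^1 \\
\downarrow{\scriptstyle \rho} & & \downarrow{\scriptstyle \rho^\sigma} \\
\mathbb{A}^1 & \xrightarrow{\;g\;} & \mathbb{A}^1
\end{array}
\tag{19}
$$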



Proof. Let $\alpha_i$ be the projection of $C$ onto the $i$th coordinate and let $h : C \to C^\sigma$ be the
restriction of $f \times g$ to $C$.

We need to show that there is a birational isomorphism $\beta : C \to \mathbb{A}^1$ such that
all of $\alpha_i \circ \beta^{-1}$ and $\beta^\sigma \circ h \circ \beta^{-1}$ are polynomials. First, we normalize the curve $C$,
then we note that $\deg h = \deg f > 1$, so $C$ must have genus 0 or 1.

Let $\{a_1, \dots, a_m\} := \alpha_1^{-1}(\infty)$, and let $n$ be the degree of $f$.

Note that $h$ must be a bijection from $\{a_1, \dots, a_m\}$ to $\{a_1^\sigma, \dots, a_m^\sigma\}$, since $h$ must
take the $\alpha_1$-fiber above $\infty$ to the $\alpha_1^\sigma$-fiber above $f(\infty) = \infty$, and cannot take any
other points into that fiber since $f$ does not take any other points to $\infty$. Let $\tau$ be
the permutation of $\{1, \dots, m\}$ such that $h(a_i) = a^\sigma_{\tau(i)}$.

Let us compare the two ways to compute the ramification index of the diagonal
of the diagram at $a_i$:

$$e_{\alpha_1}(a_i) \cdot e_f(\alpha_1(a_i)) = e_h(a_i) \cdot e_{\alpha_1^\sigma}(h(a_i)).$$

Since $\alpha_1(a_i) = \infty$ and $e_f(\infty) = n$; $h(a_i) = a^\sigma_{\tau(i)}$ and $e_{\alpha_1^\sigma}(a^\sigma_{\tau(i)}) = e_{\alpha_1}(a_{\tau(i)})$; and
$e_h(\text{any point}) \le \deg(h) = n$, the equation becomes

$$e_{\alpha_1}(a_i) \cdot n = e_h(a_i) \cdot e_{\alpha_1}(a_{\tau(i)}) \le n \cdot e_{\alpha_1}(a_{\tau(i)}).$$

So for all $i$, $e_{\alpha_1}(a_i) \le e_{\alpha_1}(a_{\tau(i)})$, with equality iff $e_h(a_i) = n$.

$\sum_i e_{\alpha_1}(a_i) = \sum_i e_{\alpha_1}(a_{\tau(i)})$ since $\tau$ is a permutation.
Therefore, for all $i$, $e_{\alpha_1}(a_i) = e_{\alpha_1}(a_{\tau(i)})$ and $e_h(a_i) = n > 1$.
Notice that, in particular, $h$ ramifies: $C$ admits a ramified, separable, non-constant
morphism $h$ of degree greater than 1 to a curve $C^\sigma$ of the same genus, so
the genus of $C$ must be 0.

Since $h$ totally ramifies at all $a_i$, there are at most two such points. If there are
two points, they are either fixed or switched by $h$, in either case contradicting the
triviality of $f$ and $g$ as in the first case $h$ is conjugate to $x^n$ and in the second to
the rational function $x^{-n}$. So the unique point $P$ in $\alpha_1^{-1}(\infty)$ is the unique point
where $h$ is totally ramified, which by the same argument must also be the unique
point in $\alpha_2^{-1}(\infty)$. Any $\beta : C \to \mathbb{P}^1$ such that $\beta(P) = \infty$ works.

□

Finally, we need to say something about $A_0$. For a coordinatewise linear $\Phi :
\mathbb{A}^k \to \mathbb{A}^k$, the difference subvarieties of $(\mathbb{A}^k, \Phi)$ are very easy to describe: over a
difference closed field the $\sigma$-variety $(\mathbb{A}^k, \Phi)$ is isomorphic to the $\sigma$-variety $(\mathbb{A}^k, \mathrm{id})$,
whose difference subvarieties are exactly the subvarieties defined over the fixed field
of $\sigma$. To exhibit the isomorphism, it suffices to find one solution of the equation
$\sigma(x) = \Phi_i(x)$ for each $i$ where we write $\Phi(x_1, \dots, x_k) = (\Phi_1(x_1), \dots, \Phi_k(x_k))$. The
question is less easy for ADs as we work over the fixed field of $\sigma$, so these parameters
are not available to us. This section clarifies this situation.

Every linear polynomial is a (possibly trivial) scaling or a translation by 1, up 
to conjugation by linear polynomials. Therefore, up to isomorphism of dynamical 
systems, a coordinate-wise linear polynomial action acts on each coordinate by 
either scaling or by adding 1. The dynamical system on $\mathbb{A}^2$ given by $\Phi(x, y) = (x + 1, y + 1)$
is isomorphic to the one given by $\Theta(z, w) = (z, w + 1)$ via the isomorphism
$(x, y) \mapsto (x - y, y)$, so we may also assume without loss of generality that the
dynamical system acts by translation on at most one coordinate. We will reduce
the more difficult case, where $\Phi$ indeed acts on one of the coordinates by translation,
to the easier case where $\Phi$ acts only by scalings.
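
The change of coordinates just described can be checked directly. The following is a minimal sketch in Python; the function names `Phi`, `Theta`, `iso` and `iso_inverse` are ours and the sample points are arbitrary rationals. It confirms on those points that $(x,y) \mapsto (x-y,y)$ intertwines the two translation systems and is invertible.

```python
from fractions import Fraction

def Phi(p):
    """Translate both coordinates by 1: (x, y) -> (x + 1, y + 1)."""
    x, y = p
    return (x + 1, y + 1)

def Theta(p):
    """Fix the first coordinate and translate the second: (z, w) -> (z, w + 1)."""
    z, w = p
    return (z, w + 1)

def iso(p):
    """The change of coordinates (x, y) -> (x - y, y)."""
    x, y = p
    return (x - y, y)

def iso_inverse(p):
    """The inverse change of coordinates (z, w) -> (z + w, w)."""
    z, w = p
    return (z + w, w)

# Arbitrary rational sample points (an assumption made for illustration).
points = [(Fraction(a, 3), Fraction(b, 7)) for a in range(-4, 5) for b in range(-4, 5)]

# iso is a morphism of dynamical systems: iso o Phi == Theta o iso.
intertwines = all(iso(Phi(p)) == Theta(iso(p)) for p in points)
# iso is invertible on the sample points.
is_bijection = all(iso_inverse(iso(p)) == p for p in points)
```

The identity holds symbolically as well: $(x+1) - (y+1) = x - y$, so the first new coordinate is fixed while the second is still translated by 1.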

That is, we may reduce to the study of algebraic dynamical systems on $\mathbb{A}^r$ or
$\mathbb{A}^{r+1}$ of the form $\Phi : \mathbb{A}^{r+1} \to \mathbb{A}^{r+1}$ given by $(x_1, \dots, x_r, y) \mapsto (\lambda_1 x_1, \dots, \lambda_r x_r, y + 1)$
where each $\lambda_i$ is a nonzero scalar or $\Phi : \mathbb{A}^r \to \mathbb{A}^r$ given by
$(x_1, \dots, x_r) \mapsto (\lambda_1 x_1, \dots, \lambda_r x_r)$.

It is clear that the coordinate hyperplanes defined by $x_i = 0$ are invariant and the
restriction of $\Phi$ to any such has the same form, but with one less scaling term.
Hence, to analyze the invariant varieties, it suffices to consider $\Phi$ on $\mathbb{G}_m^r \times \mathbb{G}_a$. Let
$H \le \mathbb{G}_m^r$ be the smallest algebraic group containing $(\lambda_1, \dots, \lambda_r)$. Then every $\Phi$-orbit
must be contained in a coset of $H \times \mathbb{G}_a$. If $M$ is the index of $H^\circ$, the connected
component of $H$, in $H$, then we see that every orbit of $\Phi^{\circ M}$ is contained in a coset of
$H^\circ \times \mathbb{G}_a$. As the $\Phi$-invariant varieties are also $\Phi^{\circ M}$-invariant, it suffices to classify
the latter. As $H^\circ$ is a connected algebraic torus, possibly after base change, it is
isomorphic to $\mathbb{G}_m^t$ for some $t \le r$ and as the action of $\Phi$ is semisimple, relative to
some isomorphism with $\mathbb{G}_m^t \times \mathbb{G}_a$, the action of $\Phi$ on $H^\circ \times \mathbb{G}_a$ takes the same form
as that of $\Phi$. Hence, we may reduce to the case that $\lambda_1, \dots, \lambda_r$ are multiplicatively
independent.

With the next lemma we use the Skolem-Chabauty [201 H] method to deduce that 
there cannot be any interesting algebraic relations on a $\Phi$-orbit. In the following
proof we make use of the $p$-adic exponential function and the consequence of the
Weierstrass Division Theorem that a convergent $p$-adic power series in one variable
which vanishes on infinitely many $p$-adic integers must be identically zero (see
chapter 6 of [9] for details).

Definition 4.3.1. If $(X,F)$ is a dynamical system and $P \in X(K)$, then $\mathcal{O}_F(P) \subseteq
X(K)$ is the (forward) orbit of $P$ under $F$.

Lemma 4.3.1. Let $K$ be an algebraically closed field of characteristic zero and
$\lambda_1, \dots, \lambda_r \in K^\times$ a sequence of multiplicatively independent elements of $K$. Let
$\Phi : \mathbb{G}_m^r \times \mathbb{G}_a \to \mathbb{G}_m^r \times \mathbb{G}_a$ be given by $(x_1, \dots, x_r, y) \mapsto (\lambda_1 x_1, \dots, \lambda_r x_r, y + 1)$. Let
$P = (a_1, \dots, a_r, b) \in (\mathbb{G}_m^r \times \mathbb{G}_a)(K)$. Then $\mathcal{O}_\Phi(P)$ is Zariski dense in $\mathbb{G}_m^r \times \mathbb{G}_a$.

Proof. Let us start with a few reductions.

First, the lemma is obvious when $r = 0$. So, we may assume $r > 0$.

Secondly, if $\Psi(x_1, \dots, x_r, y) := (a_1 x_1, \dots, a_r x_r, y + b)$, then $\Psi$ is an automorphism
of $\mathbb{G}_m^r \times \mathbb{G}_a$ (as an algebraic variety), takes $\mathcal{O}_\Phi((1, \dots, 1, 0))$ to $\mathcal{O}_\Phi(P)$
and preserves the product structure. Hence, we may, and do, assume that $P =
(1, \dots, 1, 0)$. Thus, $\mathcal{O}_\Phi(P) = \{(\lambda_1^n, \dots, \lambda_r^n, n) : n \in \mathbb{N}\}$.

Finally, if the lemma fails, then we may assume that there is an irreducible
hypersurface $Y$ for which $Y(K) \cap \mathcal{O}_\Phi(P)$ is Zariski dense in $Y$. Indeed, if $Z \subseteq
\mathbb{G}_m^r \times \mathbb{G}_a$ is an irreducible subvariety which contains a Zariski dense set of points
from $\mathcal{O}_\Phi(P)$, then the projection of $Z$ to $\mathbb{G}_m^r$ must be either a point or all of $\mathbb{G}_m^r$ by
the Mordell-Lang theorem for the multiplicative group. As $\mathcal{O}_\Phi(P)$ is not contained
in $F \times \mathbb{G}_a(K)$ for any finite set $F$, we see that some component of $\overline{\mathcal{O}_\Phi(P)}$ must be
a hypersurface.

So, we can find an irreducible polynomial $G(x_1, \dots, x_r, y) \in K[x_1, \dots, x_r, y]$
defining a hypersurface $Y$ containing a Zariski dense set of points from $\mathcal{O}_\Phi(P)$.

As all of these data are defined over a finitely generated extension of $\mathbb{Q}$, by
choosing an appropriate rational prime $p$ we may assume that $G \in \mathbb{Z}_p[x_1, \dots, x_r, y]$
and that each $\lambda_i$ is a nonzero $p$-adic number.

Let us write $\lambda_i = \omega_i \exp_p(\mu_i)$ where $\omega_i$ is a $(p-1)$th root of unity and $\exp_p$ is the
$p$-adic exponential function. The function $z \mapsto \lambda_i^z$ is $p$-adic analytic on each coset
$N + p\mathbb{Z}_p$ of $p\mathbb{Z}_p$ in $\mathbb{Z}_p$ (where $N \in \mathbb{Z}$) and is given by the formula $z \mapsto \lambda_i^N \exp_p(\mu_i(z - N))$.
Hence, the function $g : \mathbb{Z}_p \to \mathbb{Z}_p$ given by $g(z) := G(\lambda_1^z, \dots, \lambda_r^z, z)$ is itself analytic
on each coset of $p\mathbb{Z}_p$.

From our hypotheses, $g$ vanishes on infinitely many natural numbers, and thus,
because it is piecewise analytic, on some coset of $p\mathbb{Z}_p$. Thus, on some coset of $p\mathbb{Z}_p$
the series expansion for $g$ is identically zero.

Let $H(x_1, \dots, x_r, y) := \frac{\partial G}{\partial y} + \sum_{i=1}^{r} \mu_i x_i \frac{\partial G}{\partial x_i}$.

Differentiating, we see that $g'(z) = H(\lambda_1^z, \dots, \lambda_r^z, z)$. That is, $H$ also vanishes
on an infinite subset of $\mathcal{O}_\Phi(P)$ and is therefore an element of the ideal generated
by $G$. So, there is a number $\alpha$ with $H = \alpha G$.

Let us write $G$ in multi-index notation as $G(X, y) = \sum_{I \in \mathbb{N}^r, j \in \mathbb{N}} G_{I,j} X^I y^j$. Then
we compute that $H(X, y) = \sum_{I \in \mathbb{N}^r, j \in \mathbb{N}} \big( (\sum_{i=1}^{r} \mu_i I_i) G_{I,j} + (j+1) G_{I,j+1} \big) X^I y^j$.

As $Y$ is not horizontal, there must be some nonzero multi-index $I$ for which
there is some $j$ with $G_{I,j} \ne 0$. Choosing $j$ maximal with this property we have
$(\sum_{i=1}^{r} \mu_i I_i) G_{I,j} = \alpha G_{I,j}$. If $\alpha = 0$, then we obtain a nontrivial linear dependence
amongst the $\mu_i$'s contrary to the multiplicative independence of the $\lambda_i$'s. In any
case, as $Y$ is irreducible, there must be some other multi-index $K$ (possibly an
$r$-tuple of zeros) for which there is some $\ell$ with $G_{K,\ell} \ne 0$. Again choosing $\ell$ maximal
with this property we deduce that $\sum_{i=1}^{r} \mu_i K_i = \alpha$. Hence, $\sum_{i=1}^{r} \mu_i (I_i - K_i) = 0$ is
a nontrivial linear dependence. With this contradiction we conclude the proof. □
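
The coefficient bookkeeping in the last step of this proof can be illustrated numerically. The following is a minimal sketch in Python, under our own conventions: a polynomial $G$ is stored as a dictionary mapping `(I, j)` to a nonzero coefficient, the weights `mu` stand in for the $\mu_i$, and the names `h_coefficients` and `top_y_ratio` are hypothetical helpers, not notation from the paper. It computes the coefficients of $H = \partial G/\partial y + \sum_i \mu_i x_i\, \partial G/\partial x_i$ and reads off, for each multi-index $I$, the ratio $H_{I,j}/G_{I,j}$ at the top $y$-degree, which is $\sum_i \mu_i I_i$.

```python
from fractions import Fraction

def h_coefficients(G, mu):
    """Coefficients of H = dG/dy + sum_i mu_i * x_i * dG/dx_i.

    G is a dict {(I, j): coeff} with I a tuple of exponents of x_1..x_r
    and j the exponent of y; all stored coefficients are assumed nonzero.
    """
    H = {}
    for (I, j), c in G.items():
        # mu_i * x_i * d/dx_i scales the monomial X^I y^j by sum_i mu_i * I_i.
        weight = sum(m * e for m, e in zip(mu, I))
        H[(I, j)] = H.get((I, j), 0) + weight * c
        # d/dy sends y^j to j * y^(j-1).
        if j > 0:
            H[(I, j - 1)] = H.get((I, j - 1), 0) + j * c
    return {key: val for key, val in H.items() if val != 0}

def top_y_ratio(G, H, I):
    """Ratio H_{I,j} / G_{I,j} at the largest j with G_{I,j} nonzero."""
    j = max(jj for (II, jj) in G if II == I)
    return Fraction(H.get((I, j), 0), G[(I, j)])

# Illustrative data (assumptions): G = x1^2*y + 3*x1*x2 + 5*y^2, mu = (2, 3).
mu = (2, 3)
G = {((2, 0), 1): 1, ((1, 1), 0): 3, ((0, 0), 2): 5}
H = h_coefficients(G, mu)
ratios = {I: top_y_ratio(G, H, I) for I in {(2, 0), (1, 1), (0, 0)}}
```

In this example the three ratios differ, so $H$ is not a scalar multiple of $G$. In the proof, $H = \alpha G$ forces all such ratios to equal $\alpha$, which is what produces the linear dependence among the $\mu_i$.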

Our calculations in the case of linear dynamics yield the following proposition. 

Proposition 4.3.4. Let $\Phi : \mathbb{A}^n \to \mathbb{A}^n$ be given by $(x_1, \dots, x_n) \mapsto (\Phi_1(x_1), \dots, \Phi_n(x_n))$
where each $\Phi_j$ is linear. Let $r$ be the dimension of the Zariski closure of the subgroup
of $\mathrm{GL}_n$ generated by $(\Phi_1, \dots, \Phi_n)$. Then there is an isomorphism of ADs
$\iota : (\mathbb{A}^n, \Phi) \to (\mathbb{A}^r, \Psi) \times (\mathbb{A}^{n-r}, \mathrm{id})$ and the irreducible $\Phi$-invariant varieties are
exactly those of the form $\iota^{-1}(V \times Y)$ where $V \subseteq \mathbb{A}^r$ is an intersection of coordinate
hyperplanes and $Y$ is any irreducible subvariety of $\mathbb{A}^{n-r}$.

Let us put all of these observations together into a refinement of Proposition 4.3.2.

Theorem 4.4. Suppose that the polynomials $\Phi_i$ are linear for $1 \le i \le a$, skew-conjugate
to monomials and Chebyshevs of degree $\ge 2$ for $a + 1 \le i \le b$, and
none of those for $b < i \le n$ and that $\Phi : \mathbb{A}^n \to \mathbb{A}^n$ is given by $(x_1, \dots, x_n) \mapsto
(\Phi_1(x_1), \dots, \Phi_n(x_n))$. Then any irreducible difference subvariety of $(\mathbb{A}^n, \Phi)$ is of
the form $A \cap B \cap C$, where $A = A_0 \times \mathbb{A}^{n-a}$, $B = \mathbb{A}^a \times B_0 \times \mathbb{A}^{n-b}$, $C = \mathbb{A}^{a+b} \times C_0$, and each of
$A$, $B$ and $C$ is a difference subvariety. Moreover:

• $A_0$ is described by Proposition 4.3.4.

• $B_0$ is a quotient by a finite group action of a translate of an algebraic torus.

• $C_0$ is a component of $\bigcap_{a+b < i < j \le n} \pi_{i,j}^{-1} \pi_{i,j}(C_0)$ and $\pi_{i,j}(C_0)$ is a point, a
line, or a curve described by Theorem 3.4.

5. Applications 

In this section we use the results from Section 4 to answer some open questions
about the model theory of difference fields and the arithmetic of algebraic dynamical 
systems. 

5.1. Trivial minimal sets in ACFA. This section is intended mainly for logi- 
cians. 

In this section we work in a sufficiently saturated model $(L, \sigma)$ of ACFA$_0$. For
a polynomial $f$ we write $f^\sharp$ for $\{a \in L : \sigma(a) = f(a)\}$. For a 1-type $p$ over some
small substructure of $L$ we write $p \in f^\sharp$ to mean that the formula $\sigma(x) = f(x)$ is an
element of the type $p$, or, equivalently, that for any realization $a \models p$ one has $a \in f^\sharp$.
Whenever we say that some property $P$ of a polynomial is definable we mean that
for each natural number $d$, the set $\{(a_0, \dots, a_d) : \sum_{i=0}^{d} a_i x^i \text{ has property } P\}$ is a
definable set.

Theorem 4.4 allows us to describe the structure of trivial minimal sets of the form
$(\mathbb{A}^1, f)^\sharp$ for a trivial polynomial $f$. In particular, we try to answer two questions
definably:

Problem 5.0.1. (1) Given $f$, for what $g$ are there non-orthogonal types $p \in f^\sharp$
and $q \in g^\sharp$?

(2) Given $p \in f^\sharp$ and $q \in g^\sharp$, are they non-orthogonal?

Another theorem from the first author's thesis translates these questions into
the language of this paper:

Theorem 5.1. If $p(x) \in f^\sharp$ is non-orthogonal to $q(y) \in g^\sharp$, then there are polynomials
$\pi$, $\rho$, and $h$ such that

$\pi : h^\sharp \to f^\sharp$ and $\rho : h^\sharp \to g^\sharp$

and the formula $\theta(x, y) := (\exists z \; \pi(z) = x \wedge \rho(z) = y)$ witnesses nonorthogonality.

Conversely, it is easy to see that given such $h$, $\pi$, and $\rho$, any $p \in f^\sharp$ and $q \in g^\sharp$
are non-orthogonal as long as

$(\exists z \; \pi(z) = x \wedge \sigma(z) = h(z)) \in p$ and

$(\exists z \; \rho(z) = y \wedge \sigma(z) = h(z)) \in q$.

Otherwise, there might or might not be some other formula witnessing non- 
orthogonality between p and q. 

Thus, the questions we need to answer are 

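Problem 5.1.1. (1) Given $f$, for what $g$ are there $h$, $\pi$, $\rho$ such that $\pi : h^\sharp \to f^\sharp$
and $\rho : h^\sharp \to g^\sharp$?

(2) Given $f$ and $g$, what are the possible $h$, $\pi$, $\rho$ such that $\pi : h^\sharp \to f^\sharp$ and
$\rho : h^\sharp \to g^\sharp$?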

As was mentioned above, morphisms of the form $f^{\Diamond n} : (\mathbb{A}^1, f) \to (\mathbb{A}^1, f^{\sigma^n})$
prevent the definability of answers, to the first question if $f$ is not over the fixed
field of any power of $\sigma$, and to the second question if $f$ is over the fixed field of some
power of $\sigma$. However, it follows from Theorem 4.4 that these are the only obstructions.
Furthermore, this is an obstruction to question 5.0.1(2) if $p$ is also defined over some
fixed field. In particular, if $f$ is not over any fixed field, there are only finitely many
definable finite-to-finite correspondences from $f$ to itself, so the model-theoretic
algebraic closure on $f^\sharp$ is finite.

Lemma 5.1.1. The following properties of two trivial polynomials $f$ and $g$ of the
same degree are first-order definable in ACFA:

$\exists a, b \;\; f = b \circ a$ and $g = a^\sigma \circ b$

$\exists h \; \exists N, M \le 2 \cdot \deg f \;\; (P_M \circ h = f \circ P_M$ and $P_N \circ h = g \circ P_N)$ or
$(C_M \circ h = f \circ C_M$ and $C_N \circ h = g \circ C_N)$

If $f$ is defined over the fixed field of $\sigma^m$ for some $m$, $\exists n \;\; g = f^{\sigma^n}$ is also definable.

Lemma 5.1.2. For fixed trivial polynomials $f$ and $g$ of the same degree, the following
properties of a pair of points $A \in f^\sharp$ and $B \in g^\sharp$ are definable:

$\exists a, b \;\; f = b \circ a$ and $g = a^\sigma \circ b$ and $B = a(A)$

$\exists h \; \exists N, M \le 2 \cdot \deg f \;\; (P_M \circ h = f \circ P_M$ and $P_N \circ h = g \circ P_N$ and $A^N = B^M)$ or
$(C_M \circ h = f \circ C_M$ and $C_N \circ h = g \circ C_N$ and $C_N(A) = C_M(B))$

If $f$ is not defined over the fixed field of $\sigma^m$ for any $m$, then there is at most one
$n$ such that $g = f^{\sigma^n}$, so

$\exists n \;\; g = f^{\sigma^n}$ and $B = f^{\Diamond n}(A)$

is also definable.

Note that for the last items in both lemmata, the requirement on $f$ is not only sufficient
but also necessary: otherwise, there is a countable family of almost-disjoint
difference subvarieties of $(\mathbb{A}^2, f \times g)$ so definability would produce an infinite
definable family of such, contradicting triviality. Combining these lemmata with
Theorem 3.4 gives:

Proposition 5.1.1. For a fixed trivial $f$, the property "$g \not\perp f$" is definable if and
only if $f$ is defined over the fixed field of $\sigma^m$ for some $m$.

For fixed trivial $f$ and $g$ with $f \not\perp g$ the property "$A \in f^\sharp$ and $B \in g^\sharp$ and
$B \in \mathrm{acl}(A)$" is definable if and only if $f$ is not defined over the fixed field of $\sigma^m$ for
any $m$.

Here $\mathrm{acl}(A)$ is the model-theoretic algebraic closure of $A \cup k$ where $k$ is a small
model of ACFA over which everything is defined.

A special case $f = g$ of the second part of the proposition gives

Corollary 5.1.1. For a generic $A \in f^\sharp$, $\mathrm{acl}(A) \cap f^\sharp$ is finite if and only if $f$ is not
defined over the fixed field of $\sigma^m$ for any $m$.

In another direction, using Proposition 1.1 in [7] we show that these trivial 
minimal sets are in fact strongly minimal: 

Proposition 5.1.2. For $f$ a nonlinear polynomial the minimal set $f^\sharp$ is strongly
minimal unless $(L^\sigma \circ f \circ L^{-1})(x) = x^k \cdot u(x)^p$ for some prime $p$, some integer $k$,
and some linear polynomial $L$, and $\sigma(\zeta) = \zeta^k$ for some primitive $p$th root of unity
$\zeta$. In that case, it is a finite union of strongly minimal sets.

Proof. With an easy computation of ramification indices one deduces from Proposition
1.1 in [7] that any infinite-coinfinite subset $S$ of $f^\sharp$ is defined by "$\exists B \in
g^\sharp \; \pi(B) = A$" for some polynomials $\pi$ and $g$ such that $\pi^\sigma \circ g = f \circ \pi$. Our computations
show that it suffices to show that this is impossible in the case that $\pi$ is
indecomposable, and that $\pi^\sigma \circ g = f \circ \pi$ either is a single skew-twist, or comes from
Theorem 3.3.

In the first case, we show that $\pi : g^\sharp \to f^\sharp$ is onto, so $S = f^\sharp$ is not coinfinite.
Indeed, in this case $f = \pi^\sigma \circ \rho$ and $g = \rho \circ \pi$ for some $\rho$. Given $A \in f^\sharp$, let
$B := \sigma^{-1}(\rho(A))$. Note that $\rho$ is a morphism of $\sigma$-varieties from $f^\sharp$ to $(g^\sigma)^\sharp$ because
$g^\sigma = \rho^\sigma \circ \pi^\sigma$. Then $\rho(A) \in (g^\sigma)^\sharp$, so $B \in g^\sharp$. On the other hand, $\pi(B) =
\pi(\sigma^{-1}(\rho(A))) = \sigma^{-1}(\pi^\sigma(\rho(A))) = \sigma^{-1}(f(A)) = A$.

In order for the second case to be relevant, we must have $\tilde{f}(x) := (L^\sigma \circ f \circ
L^{-1})(x) = x^k \cdot u(x)^p$ for some prime $p$, some integer $k$, and some linear polynomial
$L$. To lighten notation, we work with $f$ instead of $\tilde{f}$. Then $\pi(x) = x^p$ and $g(x) =
x^k \cdot u(x^p)$. If $A \in f^\sharp$, then $\sigma$ must take points in the fiber of $\pi$ above $A$ to the
points in the fiber of $\pi^\sigma = \pi$ above $f(A)$. Fix some $B \in \pi^{-1}(A)$ and a primitive
$p$th root of unity $\zeta$, and note that $g(B)$ is in the fiber of $\pi^\sigma = \pi$ above $f(A)$. So
$\sigma(B) = \eta \cdot g(B)$ for some $p$th root of unity $\eta$.

What happens with other points in the $\pi$-fiber above $A$? Note that $\pi^{-1}(A) =
\{\zeta^i B\}_{i < p}$, and that $g(\zeta^i B) = (\zeta^i B)^k \cdot
u((\zeta^i B)^p) = \zeta^{ik} g(B)$ while $\sigma(\zeta^i \cdot B) = \sigma(\zeta^i) \cdot \sigma(B)$, so

$$\sigma(\zeta^i \cdot B) = \frac{\sigma(\zeta^i)\, \eta}{\zeta^{ik}} \, g(\zeta^i \cdot B).$$

If $\sigma(\zeta)/\zeta^k$ is a primitive $p$th root of unity, we can find $i$ such that $\frac{\sigma(\zeta^i)\, \eta}{\zeta^{ik}} = 1$ and
then $\zeta^i \cdot B \in \pi^{-1}(A) \cap g^\sharp$ shows that $\pi$ is onto.

Otherwise, $\sigma(\zeta)/\zeta^k = 1$, and all points in $\pi^{-1}(A)$ belong to $(\eta \cdot g)^\sharp$, so the
formulae "$\exists B \in (\eta \cdot g)^\sharp$ such that $B^p = A$" define $p$ disjoint infinite coinfinite
subsets of $f^\sharp$, one for each $p$th root of unity $\eta$.

However, this can only happen finitely many times, as each time the out-degree
of $g$ is strictly less than the out-degree of $f$. □

5.2. Density of dynamical orbits. In this section we apply Theorem 4.4 to
deduce a version of a conjecture of Zhang on the density of dynamical orbits.
Let us recall Zhang's conjecture.

Conjecture 5.2 (Conjecture 4.1.6 of [21]). Let $K$ be a number field and $f : X \to X$
a polarizable dynamical system over $K$. Then there is a point $a \in X(K^{\mathrm{alg}})$ algebraic
over $K$ whose forward orbit $\mathcal{O}_f(a) := \{f^{\circ n}(a) : n \in \mathbb{Z}_+\}$ is Zariski dense in $X$.

The dynamical systems we have been considering, namely, $(\mathbb{A}^n, \Phi)$ given by
coordinatewise univariate polynomials as above, do not fit Conjecture 5.2 as stated
for a couple of reasons. First, as $\mathbb{A}^n$ is affine, no dynamical system on $\mathbb{A}^n$ can be
polarized. More seriously, even if we pass to a projective closure, the hypothesis of
polarizability forces all of the polynomials involved to have the same degree. We
shall prove that there are dense orbits without these restrictions.

In light of our results and a geometric version of Conjecture 5.2 due to Amerik
and Campana [1], we propose a more general conjecture on the density of dynamical
orbits.

Conjecture 5.3. Let $K$ be an algebraically closed field of characteristic zero, $X$
an irreducible algebraic variety over $K$, and $\Phi : X \to X$ a rational self-map. We
suppose that there does not exist a positive dimensional algebraic variety $Y$ and
dominant rational map $g : X \to Y$ for which $g \circ \Phi = g$ generically. Then there is
some point $a \in X(K)$ with a Zariski dense forward orbit.

We shall prove the instance of Conjecture 5.3 in which $X$ is affine space and $\Phi$
is given by a sequence of univariate polynomials.

Theorem 5.4. Let $K$ be a field of characteristic zero, $f_1, \dots, f_n \in K[x]$ non-constant
polynomials over $K$ in one variable. Suppose that the linear polynomials
amongst the $f_i$'s are independent in the sense that if $f_i$ are linear for $i \in I \subseteq
\{1, \dots, n\}$ then the Zariski closure of the subgroup of $\mathrm{GL}_{|I|}$ generated by $(f_i)_{i \in I}$ has
dimension $|I|$. Let $\Phi : \mathbb{A}^n_K \to \mathbb{A}^n_K$ be given by $(x_1, \dots, x_n) \mapsto (f_1(x_1), \dots, f_n(x_n))$.

Then there is a point $a \in \mathbb{A}^n(K)$ for which $\mathcal{O}_\Phi(a)$ is Zariski dense.

Remark 5.5. As one sees from the proof, in some sense almost every point in $\mathbb{A}^n(K)$
has a Zariski dense orbit. We do not pursue the issue of giving a quantitative
treatment of this observation.

Remark 5.6. As the reader will see, the notion of independence is exactly what is
required for Theorem 5.4 to hold for a sequence of linear polynomials. We do not
pretend that the inclusion of linear polynomials in this statement is deep, but we
have included them as there is little extra work involved in doing so and they round
out the statement.

Remark 5.7. Theorem 5.4 may be read as saying that there are points $a \in \mathbb{A}^n(K)$
having the property that for no positive integer $N$ is $\Phi^{\circ N}(a)$ contained in any
proper difference subvariety of $(\mathbb{A}^n, \Phi)$ when $K$ is treated as a difference field with
$\sigma = \mathrm{id}_K$. In fact, we will prove Theorem 5.4 by explicitly describing the irreducible
difference subvarieties of $(\mathbb{A}^n, \Phi^{\circ M})$ for all $M \in \mathbb{Z}_+$ and then observing that there
are points in $\mathbb{A}^n(K)$ whose forward orbits miss all such difference subvarieties.

We prove Theorem 5.4 as a consequence of a number of simple lemmata.

Lemma 5.7.1. Let $f : X \to X$ be an algebraic dynamical system over some field
$K$ with $X$ being irreducible. A point $a \in X(K)$ has a Zariski dense forward orbit
if and only if there is no natural number $m$ and proper $f$-invariant subvariety (not
necessarily irreducible) of $X$ containing $f^{\circ m}(a)$.

Proof. For any point $a \in X(K)$, as $f(\mathcal{O}_f(a)) = \mathcal{O}_f(f(a)) \subseteq \mathcal{O}_f(a)$, for $m \ge 0$ the
variety $\overline{\mathcal{O}_f(f^{\circ m}(a))}$ is an $f$-invariant subvariety of $X$. Hence, if $\mathcal{O}_f(a)$ is not Zariski
dense in $X$, then $\overline{\mathcal{O}_f(f^{\circ m}(a))}$ is a proper $f$-invariant subvariety of $X$. Conversely, if
$f^{\circ m}(a) \in Y \subsetneq X$ and $Y$ is $f$-invariant, then $\mathcal{O}_f(a) \subseteq Y(K) \cup \{f^{\circ i}(a) : 0 \le i < m\}$
so that $\overline{\mathcal{O}_f(a)} \subseteq Y \cup \{f^{\circ i}(a) : 0 \le i < m\} \subsetneq X$. □

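Lemma 5.7.2. If $f : X \to X$ is an algebraic dynamical system over some field
$K$, $X$ is irreducible, and $a \in X(K)$ has a Zariski dense forward orbit, then for any
$m \in \mathbb{Z}_+$, $X = \overline{\mathcal{O}_{f^{\circ m}}(a)}$.

Proof. For $i = 0, \dots, m-1$, let $Z_i := \overline{\mathcal{O}_{f^{\circ m}}(f^{\circ i}(a))}$. Then as $\mathcal{O}_f(a) = \bigcup_{i=0}^{m-1} \mathcal{O}_{f^{\circ m}}(f^{\circ i}(a))$
we have $X = \bigcup_{i=0}^{m-1} Z_i$. Hence, $X = Z_i$ for some $i$. As $X$ has a dense $f$-orbit, the
map $f : X \to X$ is necessarily dominant (otherwise, $\mathcal{O}_f(a) \subseteq \{a\} \cup f(X) \subsetneq X$).
As $f$ maps $Z_j$ to $Z_{j+1 \pmod m}$, we must have $X = Z_j$ for all $j$. In particular,
$X = Z_0 = \overline{\mathcal{O}_{f^{\circ m}}(a)}$. □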
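
The decomposition of an $f$-orbit into $m$ orbits of the iterate $f^{\circ m}$ used in this proof can be seen on a finite truncation. The following is a minimal sketch in Python; the helper names `iterate`, `orbit` and `split_orbit`, the sample map $x \mapsto x^2 + 1$, the starting point 2, and the convention that orbits start at the point itself are all our assumptions for illustration.

```python
def iterate(f, x, n):
    """Apply f to x exactly n times."""
    for _ in range(n):
        x = f(x)
    return x

def orbit(f, a, length):
    """The first `length` points a, f(a), f(f(a)), ... of the forward orbit of a."""
    pts = []
    x = a
    for _ in range(length):
        pts.append(x)
        x = f(x)
    return pts

def split_orbit(f, a, m, k):
    """The m orbits under the m-th iterate of f, started at a, f(a), ..., f^(m-1)(a),
    each truncated to k points."""
    fm = lambda x: iterate(f, x, m)
    return [orbit(fm, iterate(f, a, i), k) for i in range(m)]

# Illustrative choices (assumptions): f(x) = x^2 + 1, a = 2, m = 3, k = 4.
f = lambda x: x * x + 1
m, k = 3, 4
pieces = split_orbit(f, 2, m, k)   # plays the role of the sets whose closures are Z_0, Z_1, Z_2
full = orbit(f, 2, m * k)          # the first m*k points of the f-orbit
```

Here `pieces[i]` lists points of the $f^{\circ m}$-orbit of $f^{\circ i}(a)$, and $f$ carries `pieces[i]` into `pieces[(i + 1) % m]`, mirroring the cyclic action of $f$ on the $Z_j$.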

Lemma 5.7.3. Suppose that $f : X \to X$ and $g : Y \to Y$ are algebraic dynamical
systems over the field $K$, $(X, f) \perp (Y, g)$, and that there are rational points $a \in
X(K)$ and $b \in Y(K)$ with $\overline{\mathcal{O}_f(a)} = X$ and $\overline{\mathcal{O}_g(b)} = Y$. Then $\overline{\mathcal{O}_{(f,g)}(a, b)} = X \times Y$.

Proof. Let $Z := \overline{\mathcal{O}_{(f,g)}(a, b)}$ be the Zariski closure of the forward $(f, g)$-orbit of
$(a, b)$. As $(f, g)(\mathcal{O}_{(f,g)}(a, b)) \subseteq \mathcal{O}_{(f,g)}(a, b)$, the variety $Z$ is $(f, g)$-invariant. As
$(X, f) \perp (Y, g)$, $Z$ must be a finite union of varieties of the form $A \times B$ where
$A \subseteq X$ is $f^{\circ m}$-invariant for some $m$ and $B \subseteq Y$ is $g^{\circ \ell}$-invariant for some $\ell$. Taking
a common multiple, we may assume that all such $A$ and $B$ are invariant for the
same iterate $m$ of $f$ or $g$, respectively. Let $A \times B$ be a component containing $(a, b)$.
By Lemma 5.7.2, $X = \overline{\mathcal{O}_{f^{\circ m}}(a)} \subseteq A \subseteq X$ and $Y = \overline{\mathcal{O}_{g^{\circ m}}(b)} \subseteq B \subseteq Y$. Hence,
$X \times Y = \overline{\mathcal{O}_{(f,g)}(a, b)}$. □

The next few lemmata (Lemma 5.7.4, Lemma 5.7.5, and Lemma 5.8.1) are all
well-known, but we include them for completeness.

Lemma 5.7.4. If $f_1, \dots, f_n \in K[x]$ are independent linear polynomials over a field
$K$ of characteristic zero, then there is a point $a \in \mathbb{A}^n(K)$ for which $\mathcal{O}_{(f_1, \dots, f_n)}(a)$
is Zariski dense in $\mathbb{A}^n$.

Proof. After making a linear transformation, we may assume $(f_1(x_1), \dots, f_n(x_n)) =
(\lambda_1 x_1, \dots, \lambda_n x_n)$ or $(f_1(x_1), \dots, f_n(x_n)) = (\lambda_1 x_1, \dots, \lambda_{n-1} x_{n-1}, x_n + 1)$ where the
$\lambda_i$'s are multiplicatively independent. By Lemma 4.3.1, relative to this presentation,
any $a \in \mathbb{G}_m^n(K)$ will do. □

Lemma 5.7.5. If $N_1, \dots, N_n \in \mathbb{Z}_+$ are positive integers each greater than one and
$K$ is a field of characteristic zero, then there is a point $a \in (K^\times)^n = \mathbb{G}_m^n(K)$ with a
dense $f$-orbit where $f : \mathbb{G}_m^n \to \mathbb{G}_m^n$ is given by $(x_1, \dots, x_n) \mapsto (x_1^{N_1}, \dots, x_n^{N_n})$.

Proof. As we noted above each $f$-invariant variety is a union of translates of algebraic
subgroups of $\mathbb{G}_m^n$ by torsion points. Thus, by Lemma 5.7.1 we need only
find a point $a \in \mathbb{G}_m^n(K)$ which does not belong to any translate of an algebraic
subgroup by a torsion point. Simply take $a = (a_1, \dots, a_n)$ so that the multiplicative
group generated by $a_1, \dots, a_n$ has rank $n$. For example, let these points be
$n$ distinct rational primes. The unique factorization theorem for $\mathbb{Z}$ says that this
choice works. □
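
The rank condition in this proof is easy to test for positive integers. The following is a minimal sketch in Python, assuming the $a_i$ are integers at least 2; the helper names `factor` and `multiplicative_rank` are ours. It computes the rank of the multiplicative group generated by the given integers as the rank over $\mathbb{Q}$ of the matrix of exponents in their prime factorizations, which is where unique factorization enters.

```python
from fractions import Fraction

def factor(n):
    """Prime factorization of a positive integer n as a dict {prime: exponent}."""
    exps = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            exps[d] = exps.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        exps[n] = exps.get(n, 0) + 1
    return exps

def multiplicative_rank(nums):
    """Rank of the multiplicative subgroup of the positive rationals generated by nums.

    By unique factorization this equals the rank of the matrix whose rows are
    the exponent vectors of the prime factorizations of the entries of nums.
    """
    facts = [factor(n) for n in nums]
    primes = sorted({p for fac in facts for p in fac})
    rows = [[Fraction(fac.get(p, 0)) for p in primes] for fac in facts]
    rank = 0
    # Gaussian elimination over the rationals.
    for col in range(len(primes)):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            scale = rows[r][col] / rows[rank][col]
            rows[r] = [x - scale * y for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank
```

For distinct primes the exponent matrix is the identity, so the rank is $n$, as the proof requires. A tuple such as $(2, 4, 3)$ has rank 2 and would lie on the proper algebraic subgroup $x_2 = x_1^2$.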

Remark 5.8. In Lemma 5.7.5, it is not true that every translate by a torsion point
of an algebraic torus is invariant.

Lemma 5.8.1. If $N_1, \dots, N_n \in \mathbb{Z}_+$ are positive integers each greater than one,
$t \le n$, and $K$ is a field of characteristic zero, then there is a point $b \in \mathbb{A}^n(K)$ with
a dense $g := (C_{N_1}, \dots, C_{N_t}, P_{N_{t+1}}, \dots, P_{N_n})$-orbit.

Proof. Let $a \in \mathbb{G}_m^n(K)$ be a point with a dense $(P_{N_1}, \dots, P_{N_n})$-orbit given by
Lemma 5.7.5. The map $h : (\mathbb{G}_m^n, (P_{N_1}, \dots, P_{N_n})) \to (\mathbb{A}^n, (C_{N_1}, \dots, C_{N_t}, P_{N_{t+1}}, \dots, P_{N_n}))$
given by $(x_1, \dots, x_n) \mapsto (x_1 + \frac{1}{x_1}, \dots, x_t + \frac{1}{x_t}, x_{t+1}, \dots, x_n)$ is a dominant map of
dynamical systems. Hence, if we set $b := h(a)$, we have $\overline{\mathcal{O}_g(b)} = \mathbb{A}^n$. □
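
That $x \mapsto x + \frac{1}{x}$ is a map of dynamical systems from a monomial to a Chebyshev polynomial can be checked numerically. The following is a minimal sketch in Python, assuming the normalization $C_N(x + 1/x) = x^N + x^{-N}$ (so $C_0 = 2$, $C_1(t) = t$, $C_{k+1}(t) = t\,C_k(t) - C_{k-1}(t)$), which is the one compatible with the pullback used here; the paper's own normalization is fixed outside this section. The names `chebyshev` and `h` are ours.

```python
from fractions import Fraction

def chebyshev(n, t):
    """C_n(t), normalized so that C_n(x + 1/x) = x**n + x**(-n)."""
    prev, cur = Fraction(2), t  # C_0 = 2, C_1 = t
    if n == 0:
        return prev
    for _ in range(n - 1):
        # Three-term recurrence C_{k+1} = t * C_k - C_{k-1}.
        prev, cur = cur, t * cur - prev
    return cur

def h(x):
    """The semiconjugacy x -> x + 1/x from G_m to A^1."""
    return x + 1 / x

# Check C_N(h(x)) == h(x**N) at an arbitrary rational point (an assumption).
x = Fraction(3, 2)
semiconjugacy_holds = all(chebyshev(n, h(x)) == h(x ** n) for n in range(1, 7))
```

The check is exact because it uses rational arithmetic. It confirms the commuting square on one sample point only and is not a proof of the identity.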

Lemma 5.8.2. Let $K$ be a field of characteristic zero and $f$ and $g$ two polynomials
over $K$ of degree at least 2 neither of which is linearly conjugate to a monomial or
a Chebyshev polynomial. Suppose that $R \subseteq K$ is a subring of $K$ over which some
decompositions $f = f_k \circ \cdots \circ f_1$ and $g = g_r \circ \cdots \circ g_1$ are defined and over which each of the leading
coefficients of the polynomials in the decompositions is a unit. Then if $C \subseteq \mathbb{A}^2_K$ is
an $(f^{\circ m}, g^{\circ m})$-invariant curve for some $m \in \mathbb{Z}_+$ and $(a, b) \in C(K)$ with $a \in R$,
then $b$ is integral over $R$.

Proof. This follows immediately from Theorem 3.4. The curve $C$ is a component
of a composite of correspondences encoded by the $\beta$ and $\varphi$ operators for
the decomposition of $f$, hence as graphs of the given components of $f$ and their
converse relations, algebraic tori, and then skew twists of the decomposition of
$g^{\circ m}$, each of which is given by a polynomial over $R$. Following $a$ through these
correspondences we see that in each step either we apply a polynomial defined over
$R$ (and thus maintain integrality over $R$) or we extract a root to an equation of the
form $h(x) = c$ where $c$ is integral over $R$ and $h$ is a polynomial over $R$ with a unit
as leading coefficient. □

Lemma 5.8.3. Let $K$ be a field of characteristic zero and $f_1, \ldots, f_n \in K[x]$ a
sequence of nonconstant polynomials over $K$. We assume that each $f_i$ has degree at
least two and is not linearly conjugate to a Chebyshev polynomial or to a monomial.
Then there is a rational point $a = (a_1, \ldots, a_n) \in \mathbb{A}^n(K)$ with a dense
$(f_1, \ldots, f_n)$-orbit.

Proof. Let $R \subseteq K$ be some finitely generated subring over which complete
decompositions of each $f_i$ are defined and the leading coefficient of each
indecomposable factor is a unit. We argue by induction on $i$ that we can find some
finitely generated ring $B$ containing $R$ and contained in $K$ for which there is a
point $(a_1, \ldots, a_i) \in \mathbb{A}^i(B)$ with $\mathcal{O}_{(f_1, \ldots, f_i)}(a)$
Zariski dense in $\mathbb{A}^i$. In the case of $i = 1$, the result follows by height
considerations (for example, by embedding $R \subseteq \mathbb{C}$, if we take $a \in R$
with $|a| \gg 0$, then $\lim_{m \to \infty} f_1^{\circ m}(a) = \infty$ so that, in
particular, $a$ is not preperiodic).

In the inductive case, we have $(a_1, \ldots, a_i) \in \mathbb{A}^i(B)$ with a Zariski
dense $(f_1, \ldots, f_i)$-orbit. Let $a_{i+1} \in K$ be any element of $K$ which is not
integral over $B$. Then for every $m$, $f_{i+1}^{\circ m}(a_{i+1})$ is also non-integral,
so by Lemma 5.8.2, for each $j \le i$ the point
$(f_j^{\circ m}(a_j), f_{i+1}^{\circ m}(a_{i+1}))$ does not belong to any
$(f_j^{\circ m}, f_{i+1}^{\circ m})$-invariant curve. By triviality, it follows that
$(f_1^{\circ m}(a_1), \ldots, f_{i+1}^{\circ m}(a_{i+1}))$ does not belong to any
$(f_1^{\circ m}, \ldots, f_{i+1}^{\circ m})$-invariant variety. □
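
The base case of the induction can be made concrete. The sketch below uses the hypothetical polynomial $f(x) = x^2 + 1$ over $\mathbb{Z}$ (an illustrative choice, not taken from the text) and the starting point $a = 3$. Since $x^2 + 1 > |x|$ for every real $x$, absolute values strictly increase along the orbit, so no point of the orbit repeats and $a$ is not preperiodic.

```python
def orbit(f, a, steps):
    """Return [a, f(a), f(f(a)), ...] with `steps` applications of f."""
    points = [a]
    for _ in range(steps):
        points.append(f(points[-1]))
    return points


def is_strictly_growing(points):
    """True when absolute values strictly increase along the list."""
    return all(abs(u) < abs(v) for u, v in zip(points, points[1:]))


def f(x):
    # Hypothetical example polynomial, chosen only for illustration.
    return x * x + 1


points = orbit(f, 3, 6)
growing = is_strictly_growing(points)
```

The finite computation only displays the growth; the inequality $x^2 + 1 > |x|$ is what rules out preperiodicity for every starting value.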

Combining these lemmata with Theorem 4.4, we conclude that Theorem 5.4
is true.

5.3. Difference equations for Frobenius lifts. In this section we observe that
for dynamical systems lifting the Frobenius, one can capture the periodic points 
with a difference equation. Consequently, our results on the structure of difference 
varieties imply strong restrictions on the algebraic relations among the periodic 
points of such dynamical systems. 

In what follows, $K$ is a field with a valuation $v$, ring of integers
$R := \{x \in K : v(x) \ge 0\}$, maximal ideal $\mathfrak{m} := \{x \in R : v(x) > 0\}$,
and residue field $k := R/\mathfrak{m}$ of characteristic $p > 0$. We assume that
$\sigma : K \to K$ is an automorphism lifting the $p$-power Frobenius in the sense that
$v(\sigma(x)) = v(x)$ for all $x \in K$ and $\sigma(x) \equiv x^p \mod \mathfrak{m}$ for
$x \in R$. We assume moreover that $K$ is maximally complete and
algebraically closed. The results we prove about periodic points descend from K 
to subfields, so the reader may comfortably drop these last two hypotheses, but 
some of our intermediate results require at least completeness. Ultimately, we shall 
assume that K has characteristic zero, but for now, this is not necessary. 

If $X$ is a scheme over $R$, then we write $X_0$ for the base change of $X$ to $k$ and
$X_\eta$ for the base change of $X$ to $K$. We write $\pi : X(R) \to X_0(k)$ for the
natural reduction map.

With Theorem 5.9 we show that difference equations given by liftings of the
Frobenius give dynamical Teichmüller maps. Towards the end of this section we
specialize to the case of dynamical systems given by sequences of univariate
polynomials and thereby deduce from our earlier work that algebraic relations
amongst periodic points of such systems are highly restricted.

Theorem 5.9. Let $X$ be a separated scheme of finite type over $R$. We assume that
$X$ is smooth over $R$. Suppose that $\Gamma \subseteq X \times X^\sigma$ is a closed
subscheme of $X \times X^\sigma$ for which the projection $\Gamma \to X$ is étale.
Suppose moreover that $q = p^n$ is a power of $p$ and $\Gamma$ lifts the Frobenius in
the sense that some component of the special fibre $\Gamma_0$ is the graph of the
geometric $q$-power Frobenius morphism $F : X_0 \to X_0^\sigma$. Then the reduction
map $\pi : X(R) \to X_0(k)$ restricts to a bijection between
$(X, \Gamma)^\sharp(R, \sigma^n)$ and $X_0(k)$.

Proof. To ease notation let us write $\rho := \sigma^n$.

Let us first show that $\pi : (X, \Gamma)^\sharp(R, \rho) \to X_0(k)$ is surjective. Let
$\bar{a} \in X_0(k)$ be any $k$-rational point on $X_0$. Pick any point $a \in X(R)$ with
$\pi(a) = \bar{a}$. From the hypothesis that $X$ is smooth over $R$, we may fix an étale
covering $f : U \to \mathbb{A}^m_R$ where $a \in U(R)$, $U \subseteq X$ is an affine open
subset and $f(a) = 0$. Note that $f^\sigma : U^\sigma \to \mathbb{A}^m_R$ gives analytic
coordinates on $X^\sigma$ near $\sigma(a)$.

As $\Gamma \to X$ is étale, the set
$(f \times f^\sigma)(\Gamma(R) \cap (\pi^{-1}\{\bar{a}\} \times \pi^{-1}\{F(\bar{a})\}))$
is the graph of an analytic function $g : \mathfrak{m}^m \to \mathfrak{m}^m$ where
$g(x_1, \ldots, x_m) \equiv (x_1^q, \ldots, x_m^q) \mod \mathfrak{m}R[[x_1, \ldots, x_m]]$.
That we can find a solution to $g(x) = \sigma(x)$ follows from Newton's method
(see [19] in this context).

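That is, if for some $\gamma > 0$ we have a solution to $g(x) \equiv \sigma(x) \mod I_\gamma$
where $I_\gamma := \{x \in R : v(x) \ge \gamma\}$, we can find some $x'$ with
$x \equiv x' \mod I_\gamma$ but $g(x') \equiv \sigma(x') \mod I_{\gamma^+}$, where
$I_{\gamma^+} := \{x \in R : v(x) > \gamma\}$, and then taking limits we find a true
solution within the given neighborhood. In our case, we already know that
$g(0) \equiv 0 \mod \mathfrak{m} = I_{0^+}$. Given an approximate solution $x$, suppose
that $g(x) \equiv \sigma(x) \mod I_\gamma$ with $\gamma > 0$. Let $\epsilon \in R$ with
$v(\epsilon) = \gamma$. We seek to find $x' = x + c\epsilon$ with $c = (c_1, \ldots, c_m)$
and $v(c_i) \ge 0$ for each $i$. We have
$g(x + c\epsilon) = g(x) + \sum_{i=1}^{m} \frac{\partial g}{\partial x_i}(x)\, c_i \epsilon + \epsilon^2(\ast) \equiv g(x) \mod I_{\gamma^+}$
as $\frac{\partial g}{\partial X_i}(X) \equiv q X_i^{q-1} \mod \mathfrak{m}R[[X_1, \ldots, X_m]]$.
On the other hand,
$\sigma(x + c\epsilon) = \sigma(x) + \sigma(c)\sigma(\epsilon) \equiv \sigma(x) + (c_1^q, \ldots, c_m^q)\sigma(\epsilon) \mod I_{\gamma^+}$.
Subtracting, we need only solve
$\sigma(\epsilon)(c_1^q, \ldots, c_m^q) \equiv g(x) - \sigma(x) \mod I_{\gamma^+}$. By
hypothesis, each component of $g(x) - \sigma(x)$ has valuation at least
$\gamma = v(\sigma(\epsilon))$. As $k$ is perfect, we may solve these equations.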

These calculations demonstrate that the restriction of $\pi$ to
$(X, \Gamma)^\sharp(R, \rho)$ is injective as well, since the solution
$c = (c_1, \ldots, c_m)$ is uniquely determined modulo $\mathfrak{m}$.
Since we know the residue of the solution, this shows that the reduction map is
injective. □
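
A toy analogue of this bijection can be checked by brute force. The sketch below works in $\mathbb{Z}/p^k\mathbb{Z}$, viewed as a truncation of $\mathbb{Z}_p$, with $\sigma$ taken to be the identity (which satisfies $\sigma(x) \equiv x^p \mod p$ on $\mathbb{Z}_p$ by Fermat's little theorem). These choices, the polynomial $f(x) = x^p + p(x^2 + 1)$, and the names `f` and `fixed_points` are assumptions made for illustration; $\mathbb{Q}_p$ is not algebraically closed, so this is only an analogue of the setting above. With $\sigma$ the identity, the difference equation $f(x) = \sigma(x)$ becomes the fixed-point equation $f(x) = x$.

```python
def f(x, p, modulus):
    """Hypothetical Frobenius lift f(x) = x^p + p*(x^2 + 1), reduced
    modulo `modulus`; it satisfies f(x) = x^p mod p."""
    return (pow(x, p, modulus) + p * (x * x + 1)) % modulus


def fixed_points(p, k):
    """All x in Z/p^k with f(x) = x, found by exhaustive search."""
    modulus = p ** k
    return [x for x in range(modulus) if f(x, p, modulus) == x]


p, k = 5, 3
solutions = fixed_points(p, k)

# Reduce each solution modulo p to see which residue classes are hit.
residues = sorted(x % p for x in solutions)
```

In this toy case the uniqueness of the lift is the usual Hensel argument: the derivative of $f(x) - x$ is congruent to $-1$ modulo $p$, hence a unit, so each residue class modulo $p$ contains exactly one solution.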

Corollary 5.9.1. With $X$ and $\Gamma$ as in Theorem 5.9, for any natural number $N$
one has $(X, \Gamma)^\sharp(R, \rho) = (X, \Gamma^{\circ N})^\sharp(R, \rho^N)$.

Proof. A composite of étale extensions is étale. Hence, the hypotheses of
Theorem 5.9 apply to $X$, $\Gamma^{\circ N}$, and $nN$. So,
$\pi : (X, \Gamma^{\circ N})^\sharp(R, \rho^N) \to X_0(k)$ is also a bijection. As
$(X, \Gamma)^\sharp(R, \rho) \subseteq (X, \Gamma^{\circ N})^\sharp(R, \rho^N)$, these
sets must be equal. □

Specializing $\Gamma$ somewhat, we may use Theorem 5.9 to find a difference equation
for periodic points.

Theorem 5.10. Let $X$ be a separated scheme of finite type over $R$, smooth over $R$,
and $f : X \to X$ a morphism lifting the $q = p^n$-power Frobenius. Let
$\rho := \sigma^n$. We assume that $f = f^\rho$ and $X = X^\rho$. Then every
$f$-periodic $R$-rational point belongs to $(X, f)^\sharp(R, \rho)$.

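Proof. Let $b \in X(R)$ be an $f$-periodic point of order $M$. There are only finitely
many solutions to $f^{\circ M}(x) = x$ (as, for instance, this is true on the special
fibre). As $f = f^\rho$, the automorphism $\rho$ permutes this finite set of solutions.
Hence, $\rho^N(b) = b$ for some $N > 0$. Thus, $b$ satisfies
$\rho^{MN}(x) = f^{\circ MN}(x)$. That is,
$b \in (X, f^{\circ MN})^\sharp(R, \rho^{MN})$, which is $(X, f)^\sharp(R, \rho)$ by
Corollary 5.9.1. □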
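
The conclusion of Theorem 5.10 has a visible shadow in a toy setting. As an illustration only, the sketch below again works in $\mathbb{Z}/p^k\mathbb{Z}$ with $\sigma$ (hence $\rho$) the identity and with the hypothetical lift $f(x) = x^p + p(x^2 + 1)$; none of these choices come from the text. With $\rho$ the identity, membership in $(X, f)^\sharp(R, \rho)$ means $f(x) = x$, so the analogue of the theorem is that every periodic point of $f$ on $\mathbb{Z}/p^k\mathbb{Z}$ is already a fixed point.

```python
def f(x, p, modulus):
    """Hypothetical Frobenius lift f(x) = x^p + p*(x^2 + 1) mod `modulus`."""
    return (pow(x, p, modulus) + p * (x * x + 1)) % modulus


def is_periodic(x, p, modulus):
    """True when some iterate f^j(x), 1 <= j <= modulus, returns to x.
    A period on a set of `modulus` elements cannot exceed `modulus`."""
    y = f(x, p, modulus)
    for _ in range(modulus):
        if y == x:
            return True
        y = f(y, p, modulus)
    return False


p, k = 5, 3
modulus = p ** k
periodic = [x for x in range(modulus) if is_periodic(x, p, modulus)]
fixed = [x for x in range(modulus) if f(x, p, modulus) == x]
```

Here the two lists agree because, for every $M$, the derivative of $f^{\circ M}(x) - x$ is congruent to $-1$ modulo $p$, so each residue class modulo $p$ contains at most one solution of $f^{\circ M}(x) = x$, and a fixed point of $f$ already occupies each class.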

Remark 5.11. Theorem 5.10 holds for $f$ analytic. This observation yields interesting
information in the case that $X$ is a moduli space of abelian varieties,
$\Gamma \subseteq X \times X$ is a $p$-power Hecke correspondence, and $f : X \to X$
(or, really, $f$ is defined on some dense open subset) is a branch of $\Gamma$ lifting
the Frobenius. In this case, the difference equation captures the canonical lifts.
(See [18] for more details.)

Remark 5.12. If in Theorem 5.10 we assume that $k = \mathbb{F}_p^{\mathrm{alg}}$, then
as every point in $X_0(k)$ is $f$-periodic, every point in $(X, f)^\sharp(R, \rho)$ is
$f$-periodic.

Remark 5.13. This method of obtaining interesting difference equations for
periodic points by lifting equations on the Frobenius has been used in the study of
Manin-Mumford questions [11, 16]. When more structure (for instance, a group)
is available, then more complicated equations beyond simply $f(x) = \sigma(x)$ may
be used to give deeper information. We expect that these equations in the more
general dynamical context will be useful, but we have not pursued this issue.

Let us conclude by specializing to the case of sequences of univariate polynomials. 

Theorem 5.14. Let $q = p^\ell$ be a power of $p$. We suppose that $K$ has
characteristic zero. Let $f_1, \ldots, f_n \in R[x]$ be polynomials with
$f_i(x) \equiv x^q \mod \mathfrak{m}R[x]$ for each $i \le n$. We suppose that for some
$m > 0$, $f_i = f_i^{\sigma^m}$ for each $i$. If $X \subseteq \mathbb{A}^n_K$ is an
irreducible subvariety containing a Zariski dense set of points of the form
$(\zeta_1, \ldots, \zeta_n)$ where $\zeta_i \in R$ is $f_i$-periodic, then $X$ is a
difference subvariety of $(\mathbb{A}^n, (f_1^{\circ m}, \ldots, f_n^{\circ m}))$ and has
the shape described in Section \5.SX Moreover, if $\deg(f_i) = q$ for each $i$, then
we may replace the hypothesis "$\zeta_i \in R$" by "$\zeta_i \in K$."

Proof. By Theorem 5.10 the $(f_1, \ldots, f_n)$-periodic points in $\mathbb{A}^n(R)$ are
all contained in
$(\mathbb{A}^n, (f_1^{\circ m}, \ldots, f_n^{\circ m}))^\sharp(R, \sigma^{\ell m})$. Hence,
if $X$ contains a Zariski dense set of periodic points from $\mathbb{A}^n(R)$, then
$X \cap (\mathbb{A}^n, (f_1^{\circ m}, \ldots, f_n^{\circ m}))^\sharp(R, \sigma^{\ell m})$
is Zariski dense in $X$, implying that $X$ is a difference subvariety of
$(\mathbb{A}^n, (f_1^{\circ m}, \ldots, f_n^{\circ m}))$. The description of $X$ now
follows from our description of such difference varieties.

For the "moreover" clause observe that if $\deg(f_i) = q$, then the leading
coefficient of $f_i$ is a unit of $R$, so every $f_i$-periodic point is integral over
$R$ and, hence, actually an element of $R$ as $R$ is integrally closed in $K$. □

Remark 5.15. Further specializing Theorem 5.14 one obtains statements about
algebraic relations amongst the periodic points of polynomials without reference to
valuations. For example, let $q$ be a power of a prime number $p$. Suppose that
$f(x) = x^q + p\,g(x)$ where $g(x) \in \mathbb{Z}[x]$ and $\deg(g) < q$. Suppose moreover
that $f$ is not linearly conjugate to a monomial or a Chebyshev polynomial and that
$f$ is not a compositional power. Then every irreducible variety
$X \subseteq \mathbb{A}^n_{\mathbb{C}}$ which contains a Zariski dense set of $n$-tuples
of $f$-periodic points is defined by a sequence of equations of the form
$f(x_i) = x_j$ or $f(x_i) = a$ for $a$ some fixed $f$-periodic point.

The remark is true because all compositional factors of $f$ have degree a power of
$p$. In particular, no two have relatively prime degrees, so no Ritt swaps are possible
amongst them.